1 ML

By François Pottier and Didier Rémy

1.1 Preliminaries

Names and renaming

Mathematicians and computer scientists use names to refer to arbitrary or unknown objects in the statement of a theorem, to refer to the parameters of a function, etc. Names are convenient because they are understandable by humans; nevertheless, they can be tricky. An in-depth treatment of the difficulties associated with names and renaming is beyond the scope of the present chapter: we encourage the reader to study Gabbay and Pitts' excellent series of papers (Gabbay and Pitts, 2002; Pitts, 2002b). Here, we merely recall a few notions that are used throughout this chapter. Consider, for instance, an inductive definition of the abstract syntax of a simple programming language, the pure $\lambda$-calculus:
$$t ::= z \mid \lambda z.t \mid t\ t$$
Here, the meta-variable $z$ ranges over an infinite set of variables (that is, names), while the meta-variable $t$ ranges over terms. As usual in mathematics, we write "the variable $z$" and "the term $t$" instead of "the variable denoted by $z$" and "the term denoted by $t$". The above definition states that a term may be a variable $z$, a pair of a variable and a term, written $\lambda z.t$, or a pair of terms, written $t_1\, t_2$. However, this is not quite what we need. Indeed, according to this definition, the terms $\lambda z_1.z_1$ and $\lambda z_2.z_2$ are distinct, while we would like them to be a single mathematical object, because we intend $\lambda z.z$ to mean "the function that maps $z$ to $z$", a meaning that is independent of the name $z$. To achieve this effect, we complete the above definition by stating that the construction $\lambda z.t$ binds $z$ within $t$. One may also say that $\lambda z$ is a binder whose scope is $t$. Then, $\lambda z.t$ is no longer a pair: rather, it is an abstraction of the variable $z$ within the term $t$. Abstractions have the property that the identity of the bound variable does not matter; that is, $\lambda z_1.z_1$ and $\lambda z_2.z_2$ are the same term. Informally, we say that terms are considered equal modulo $\alpha$-conversion.
Once the position and scope of binders are known, several standard notions follow, such as the set of free variables of a term $t$, written $fv(t)$, and the capture-avoiding substitution of a term $t_1$ for a variable $z$ within a term $t_2$, written $[z \mapsto t_1]\,t_2$. For conciseness, we write $fv(t_1, t_2)$ for $fv(t_1) \cup fv(t_2)$. A term is said to be closed when it has no free variables.
A renaming is a total bijective mapping from variables to variables that affects only a finite number of variables. The sole property of a variable is its identity, that is, the fact that it is distinct from other variables. As a result, at a global level, all variables are interchangeable: if a theorem holds in the absence of hypotheses about any particular variable, then every renaming of it holds as well. We often make use of this fact. When proving a theorem $T$, we say that a hypothesis $H$ may be assumed without loss of generality (w.l.o.g.) if the theorem $T$ follows from the theorem $H \Rightarrow T$ via a renaming argument, which is usually left implicit.
If $\bar{z}_1$ and $\bar{z}_2$ are sets of variables, we write $\bar{z}_1 \mathbin{\#} \bar{z}_2$ as a shorthand for $\bar{z}_1 \cap \bar{z}_2 = \varnothing$, and say that $\bar{z}_1$ is fresh for $\bar{z}_2$ (or vice versa). We say that $\bar{z}$ is fresh for $t$ if and only if $\bar{z} \mathbin{\#} fv(t)$ holds.
In this chapter, we work with several distinct varieties of names: program variables, memory locations, and type variables, the latter of which may be further divided into kinds. We draw names of different varieties from disjoint sets, each of which is infinite.

1.2 What is ML?

The name "ML" appeared during the late seventies. It then referred to a general-purpose programming language that was used as a meta-language (whence its name) within the theorem prover LCF (Gordon, Milner, and Wadsworth, 1979b). Since then, several new programming languages, each of which offers several different implementations, have drawn inspiration from it. So, what does "ML" stand for today?
For a semanticist, "ML" might stand for a programming language featuring first-class functions, data structures built out of products and sums, mutable
memory cells called references, exception handling, automatic memory management, and a call-by-value semantics. This view encompasses the Standard ML (Milner, Tofte, and Harper, 1990) and Caml (Leroy, 2000) families of programming languages. We refer to it as ML-the-programming-language.
For a type theorist, "ML" might stand for a particular breed of type systems, based on the simply-typed $\lambda$-calculus, but extended with a simple form of polymorphism introduced by let declarations. These type systems have decidable type inference; their type inference algorithms crucially rely on first-order unification and can be made efficient in practice. In addition to Standard ML and Caml, this view encompasses programming languages such as Haskell (Hudak, Peyton Jones, Wadler, Boutel, Fairbairn, Fasel, Guzman, Hammond, Hughes, Johnsson, Kieburtz, Nikhil, Partain, and Peterson, 1992) and Clean (Brus, van Eekelen, van Leer, and Plasmeijer, 1987), whose semantics is rather different (indeed, it is pure and lazy) but whose type system fits this description. We refer to it as ML-the-type-system. It is also referred to as Hindley and Milner's type discipline in the literature.
For us, "ML" might also stand for the particular programming language whose formal definition is given and studied in this chapter. It is a core calculus featuring first-class functions, let declarations, and constants. It is equipped with a call-by-value semantics. By customizing constants and their semantics, one may recover data structures, references, and more. We refer to this particular calculus as ML-the-calculus.
Why study ML-the-type-system today, such a long time after its initial discovery? One may think of at least two reasons.
First, its treatment in the literature is often cursory, because it is considered either as a simple extension of the simply-typed $\lambda$-calculus (TAPL Chapter 9) or as a subset of Girard and Reynolds' System F (TAPL Chapter 23). The former view is supported by the claim that the let construct, which distinguishes ML-the-type-system from the simply-typed $\lambda$-calculus, may be understood as a simple textual expansion facility. However, this view only tells part of the story, because it fails to give an account of the principal types property enjoyed by ML-the-type-system, leads to a naïve type inference algorithm whose time complexity is exponential, and breaks down when the language is extended with side effects, such as state or exceptions. The latter view is supported by the fact that every type derivation within ML-the-type-system is also a valid type derivation within an implicitly-typed variant of System F. Such a view is correct, but again fails to give an account of type inference for ML-the-type-system, since type inference for System F is undecidable (Wells, 1999).
Second, existing accounts of type inference for ML-the-type-system (Milner, 1978; Damas and Milner, 1982; Tofte, 1988; Leroy, 1992; Lee and Yi, 1998;
Jones, 1999) usually involve heavy manipulations of type substitutions. Such a ubiquitous use of type substitutions is often quite obscure. Furthermore, actual implementations of the type inference algorithm do not explicitly manipulate substitutions; instead, they extend a standard first-order unification algorithm, where terms are updated in place as new equations are discovered (Huet, 1976). Thus, it is hard to tell, from these accounts, how to write an efficient type inference algorithm for ML-the-type-system. Yet, in spite of the increasing speed of computers, efficiency remains crucial when ML-the-type-system is extended with expensive features, such as Objective Caml's object types (Rémy and Vouillon, 1998) or polymorphic methods (Garrigue and Rémy, 1999).
For these reasons, we believe it is worth giving an account of ML-the-type-system that focuses on type inference and strives to be at once elegant and faithful to an efficient implementation. To achieve these goals, we forego type substitutions and instead put emphasis on constraints, which offer a number of advantages. First, constraints allow a modular presentation of type inference as the combination of a constraint generator and a constraint solver. Such a decomposition allows reasoning separately about when a program is correct, on the one hand, and how to check whether it is correct, on the other hand. It has long been standard in the setting of the simply-typed $\lambda$-calculus (TAPL Chapter 22), but, to the best of our knowledge, has never been proposed for ML-the-type-system. Second, it is often natural to define and implement the solver as a constraint rewriting system. Then, the constraint language allows reasoning not only about correctness (is every rewriting step meaning-preserving?) but also about low-level implementation details, since constraints are the data structures manipulated throughout the type inference process. For instance, describing unification in terms of multiequations (Jouannaud and Kirchner, 1991) allows reasoning about the sharing of nodes in memory, which a substitution-based approach cannot account for. Last, constraints are more general than type substitutions, and allow describing many extensions of ML-the-type-system, among which extensions with recursive types, rows, subtyping, first-order unification under a mixed prefix, and more.
Before delving into the details of this new presentation of ML-the-type-system, however, it is worth recalling its standard definition. Thus, in what follows, we first define the syntax and operational semantics of the programming language ML-the-calculus, and equip it with a type system, known as Damas and Milner's type system.
Figure 1-1: Syntax of ML-the-calculus

ML-the-calculus

The syntax of ML-the-calculus is defined in Figure 1-1. It is made up of several syntactic categories.
Identifiers group several kinds of names that may be referenced in a program: variables, memory locations, and constants. We let $x$ and $y$ range over identifiers. Variables (sometimes called program variables to avoid ambiguity) are names that may be bound to values using $\lambda$ or let binding forms; in other words, they are names for function parameters or local definitions. We let $z$ and $f$ range over program variables. We sometimes write $\_$ for a program variable that does not occur free within its scope: for instance, $\lambda \_.t$ stands for $\lambda z.t$, provided $z$ is fresh for $t$. Memory locations are names that represent memory addresses. By convention, memory locations never appear in source programs, that is, programs that are submitted to a compiler. They only appear during execution, when new memory blocks are allocated. Constants are fixed names for primitive values and operations, such as integer literals and integer arithmetic operations. Constants are elements of a finite or infinite set $\mathcal{Q}$. They are never subject to $\alpha$-conversion. Program variables, memory locations, and constants belong to distinct syntactic classes and may never be confused.
The set of constants $\mathcal{Q}$ is kept abstract, so most of our development is independent of its concrete definition. We assume that every constant $c$ has a nonnegative integer arity $a(c)$. We further assume that $\mathcal{Q}$ is partitioned into subsets of constructors $\mathcal{Q}^{+}$ and destructors $\mathcal{Q}^{-}$. Constructors and destructors differ in that the former are used to form values, while the latter are used to operate on values.
1.2.1 Example [Integers]: For every integer $n$, one may introduce a nullary constructor $\hat{n}$. In addition, one may introduce a binary destructor $\hat{+}$, whose applications are written infix, so $t_1 \mathbin{\hat{+}} t_2$ stands for the double application $\hat{+}\, t_1\, t_2$ of the destructor $\hat{+}$ to the expressions $t_1$ and $t_2$.
Expressions, also known as program terms or programs, are the main syntactic category. Indeed, unlike procedural languages such as C and Java, functional languages, including ML-the-programming-language, suppress the distinction between expressions and statements. Expressions include identifiers, $\lambda$-abstractions, applications, and local definitions. The $\lambda$-abstraction $\lambda z.t$ represents the function of one parameter named $z$ whose result is the expression $t$, or, in other words, the function that maps $z$ to $t$. Note that the variable $z$ is bound within the term $t$, so (for instance) $\lambda z_1.z_1$ and $\lambda z_2.z_2$ are the same object. The application $t_1\, t_2$ represents the result of calling the function $t_1$ with actual parameter $t_2$, or, in other words, the result of applying $t_1$ to $t_2$. Application is left-associative, that is, $t_1\, t_2\, t_3$ stands for $(t_1\, t_2)\, t_3$. The construct let $z = t_1$ in $t_2$ represents the result of evaluating $t_2$ after binding the variable $z$ to $t_1$. Note that the variable $z$ is bound within $t_2$, but not within $t_1$, so for instance let $z_1 = z_1$ in $z_1$ and let $z_2 = z_1$ in $z_2$ are the same object. The construct let $z = t_1$ in $t_2$ has the same meaning as $(\lambda z.t_2)\, t_1$, but is dealt with in a more flexible way by ML-the-type-system. To sum up, the syntax of ML-the-calculus is that of the pure $\lambda$-calculus, extended with memory locations, constants, and the let construct.
Values form a subset of expressions. They are expressions whose evaluation is completed. Values include identifiers, $\lambda$-abstractions, and applications of constants, of the form $c\, v_1 \ldots v_k$, where $k$ does not exceed $c$'s arity if $c$ is a constructor, and $k$ is smaller than $c$'s arity if $c$ is a destructor. In what follows, we are often interested in closed values, that is, values that do not contain any free program variables. We use the meta-variables $v$ and $w$ for values.
1.2.2 Example: The integer literals $\ldots, \widehat{-1}, \hat{0}, \hat{1}, \ldots$ are nullary constructors, so they are values. Integer addition $\hat{+}$ is a binary destructor, so it is a value, and so is every partial application $\hat{+}\, v$. Thus, both $\hat{+}\, \hat{1}$ and $\hat{+}\, \hat{+}$ are values. An application of $\hat{+}$ to two values, such as $\hat{2} \mathbin{\hat{+}} \hat{2}$, is not a value.
1.2.3 Example [Pairs]: Let $(\cdot,\cdot)$ be a binary constructor. If $t_1$ and $t_2$ are expressions, then the double application $(\cdot,\cdot)\, t_1\, t_2$ may be called the pair of $t_1$ and $t_2$, and may be written $(t_1, t_2)$. By the definition above, $(t_1, t_2)$ is a value if and only if $t_1$ and $t_2$ are both values.
Stores are finite mappings from memory locations to closed values. A store $\mu$ represents what is usually called a heap, that is, a collection of data structures, each of which is allocated at a particular address in memory and may contain pointers to other elements of the heap. ML-the-programming-language allows overwriting the contents of an existing memory block, an operation sometimes referred to as a side effect. In the operational semantics, this effect is achieved by mapping an existing memory location to a new value. We write $\varnothing$ for the empty store. We write $\mu[m \mapsto v]$ for the store that maps $m$ to $v$ and otherwise coincides with $\mu$. When $\mu$ and $\mu'$ have disjoint domains, we write $\mu\mu'$ for their union. We write $\operatorname{dom}(\mu)$ for the domain of $\mu$ and $\operatorname{range}(\mu)$ for the set of memory locations that appear in its codomain.
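A store can be pictured as a finite map. The following sketch is our own illustration, not part of the chapter's formal development; it represents memory locations as strings, leaves values opaque, and mirrors the operations $\mu[m \mapsto v]$ and the disjoint union $\mu\mu'$ just defined.

```python
# A store maps memory locations (here, strings) to closed values
# (here, arbitrary Python objects standing in for values).
Store = dict[str, object]

def update(mu: Store, m: str, v: object) -> Store:
    """mu[m |-> v]: the store that maps m to v and otherwise coincides with mu."""
    mu2 = dict(mu)
    mu2[m] = v
    return mu2

def union(mu1: Store, mu2: Store) -> Store:
    """The store mu1 mu2; defined only when the domains are disjoint."""
    assert not (mu1.keys() & mu2.keys()), "domains must be disjoint"
    return {**mu1, **mu2}
```

Note that `update` both extends the store with a fresh location and overwrites an existing one, matching the text: a side effect is modeled by mapping an existing location to a new value.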
The operational semantics of a purely functional language, such as the pure $\lambda$-calculus, may be defined as a rewriting system on expressions. Because ML-the-calculus has side effects, however, we define its operational semantics as a rewriting system on configurations. A configuration $t/\mu$ is a pair of an expression $t$ and a store $\mu$. The memory locations in the domain of $\mu$ are considered bound within $t/\mu$, so (for instance) $m_1/(m_1 \mapsto \hat{0})$ and $m_2/(m_2 \mapsto \hat{0})$ are the same object. In what follows, we are often interested in closed configurations, that is, configurations $t/\mu$ such that $t$ has no free program variables and every memory location that appears within $t$ or within the range of $\mu$ is in the domain of $\mu$. If $t$ is a source program, its evaluation begins within an empty store, that is, with the configuration $t/\varnothing$. Because, by convention, source programs do not contain memory locations, this is a closed configuration. Furthermore, we shall see that all reducts of a closed configuration are closed as well. Please note that, instead of separating expressions and stores, it is possible to make store fragments part of the syntax of expressions; this idea, proposed in (Crank and Felleisen, 1991), is reminiscent of the encoding of reference cells in process calculi (Turner, 1995; Fournet and Gonthier, 1996).
A context is an expression where a single subexpression has been replaced with a hole, written $[\,]$. Evaluation contexts form a strict subset of contexts. In an evaluation context, the hole is meant to highlight a point in the program where it is valid to apply a reduction rule. Thus, the definition of evaluation contexts determines a reduction strategy: it tells where and in what order reduction steps may occur. For instance, the fact that $\lambda z.[\,]$ is not an evaluation context means that the body of a function is never evaluated, that is, not until the function is applied; see R-BETA below. The fact that $t\, \mathcal{E}$ is an evaluation context only if $t$ is a value means that, to evaluate an application $t_1\, t_2$, one should fully evaluate $t_1$ before attempting to evaluate $t_2$. More generally, in the case of a multiple application, it means that arguments should be evaluated from left to right. Of course, other choices could be made: for instance, defining $\mathcal{E} ::= \ldots \mid t\, \mathcal{E} \mid \mathcal{E}\, v \mid \ldots$ would enforce a right-to-left evaluation order, while defining $\mathcal{E} ::= \ldots \mid t\, \mathcal{E} \mid \mathcal{E}\, t \mid \ldots$ would leave the evaluation order unspecified, effectively allowing reduction to alternate between
$$(\lambda z.t)\, v \longrightarrow [z \mapsto v]\, t \tag{R-BETA}$$

$$\text{let } z = v \text{ in } t \longrightarrow [z \mapsto v]\, t \tag{R-LET}$$

$$\frac{t/\mu \xrightarrow{\delta} t'/\mu'}{t/\mu \longrightarrow t'/\mu'} \tag{R-DELTA}$$

$$\frac{t/\mu \longrightarrow t'/\mu' \qquad \operatorname{dom}(\mu'') \mathbin{\#} \operatorname{dom}(\mu') \qquad \operatorname{range}(\mu'') \mathbin{\#} \operatorname{dom}(\mu' \setminus \mu)}{t/\mu\mu'' \longrightarrow t'/\mu'\mu''} \tag{R-EXTEND}$$

$$\frac{t/\mu \longrightarrow t'/\mu'}{\mathcal{E}[t]/\mu \longrightarrow \mathcal{E}[t']/\mu'} \tag{R-CONTEXT}$$
Figure 1-2: Semantics of ML-the-calculus
both subexpressions, and making evaluation nondeterministic. The fact that let $z = v$ in $\mathcal{E}$ is not an evaluation context means that the body of a local definition is never evaluated, that is, not until the definition itself is reduced; see R-LET below. We write $\mathcal{E}[t]$ for the expression obtained by replacing the hole in $\mathcal{E}$ with the expression $t$.
Figure 1-2 defines first a relation $\longrightarrow$ between configurations, then a relation $\longrightarrow$ between closed configurations. If $t/\mu \longrightarrow t'/\mu'$ holds, then we say that the configuration $t/\mu$ reduces to the configuration $t'/\mu'$; the ambiguity involved in this definition is benign. If $t/\mu \longrightarrow t'/\mu$ holds for every store $\mu$, then we write $t \longrightarrow t'$ and say that the reduction is pure.
The key reduction rule is R-BETA, which states that a function application $(\lambda z.t)\, v$ reduces to the function body, namely $t$, where every occurrence of the formal argument $z$ has been replaced with the actual argument $v$. The $\lambda$ construct, which prevented the function body $t$ from being evaluated, disappears, so the new term may (in general) be further reduced. Because ML-the-calculus adopts a call-by-value strategy, rule R-BETA is applicable only if the actual argument is a value $v$. In other words, a function cannot be invoked until its actual argument has been fully evaluated. Rule R-LET is very similar to R-BETA. Indeed, it specifies that let $z = v$ in $t$ has the same behavior, with respect to reduction, as $(\lambda z.t)\, v$. We remark that substitution of a value for a program variable throughout a term is expensive, so R-BETA and R-LET are never implemented literally: they are only a simple specification. Actual implementations usually employ runtime environments, which may be understood as a form of explicit substitutions (Abadi, Cardelli, Curien, and Lévy, 1991). Please note that our choice of a call-by-value reduction strategy is fairly arbitrary, and has essentially no impact on the type system; the programming language Haskell (Hudak, Peyton Jones, Wadler, Boutel, Fairbairn, Fasel, Guzman, Hammond, Hughes, Johnsson, Kieburtz, Nikhil, Partain, and Peterson, 1992), whose reduction strategy is known as lazy or call-by-need, also relies on Hindley and Milner's type discipline.
Rule R-DELTA describes the semantics of constants. It merely states that a certain relation $\xrightarrow{\delta}$ is a subset of $\longrightarrow$. Of course, since the set of constants is unspecified, the relation $\xrightarrow{\delta}$ must be kept abstract as well. We require that, if $t/\mu \xrightarrow{\delta} t'/\mu'$ holds, then
(i) $t$ is of the form $c\, v_1 \ldots v_n$, where $c$ is a destructor of arity $n$; and
(ii) $\operatorname{dom}(\mu)$ is a subset of $\operatorname{dom}(\mu')$.
Condition (i) ensures that $\delta$-reduction concerns full applications of destructors only, and that these are evaluated in accordance with the call-by-value strategy. Condition (ii) ensures that $\delta$-reduction may allocate new memory locations, but not deallocate existing locations. In particular, a "garbage collection" operator, which destroys unreachable memory cells, cannot be made available as a constant. Doing so would not make much sense anyway in the presence of R-EXTEND, which states that any valid reduction is also valid in a larger store. Condition (ii) allows proving that, if $t/\mu$ reduces to $t'/\mu'$, then $\operatorname{dom}(\mu)$ is a subset of $\operatorname{dom}(\mu')$; this is left as an exercise to the reader.
1.2.4 Example [Integers, Continued]: The operational semantics of integer addition may be defined as follows:
$$\hat{k}_1 \mathbin{\hat{+}} \hat{k}_2 \xrightarrow{\delta} \widehat{k_1 + k_2} \tag{R-ADD}$$
The left-hand term is the double application $\hat{+}\, \hat{k}_1\, \hat{k}_2$, while the right-hand term is the integer literal $\hat{k}$, where $k$ is the sum of $k_1$ and $k_2$. The distinction between object level and meta level (that is, between $\hat{k}$ and $k$) is needed here to avoid ambiguity.
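The object-level/meta-level distinction can be made tangible in a few lines. In this sketch of ours (not part of the formal development), `Lit` plays the role of the nullary constructor $\hat{k}$, carrying the meta-level integer $k$, and `delta_add` implements the $\delta$-rule R-ADD.

```python
from dataclasses import dataclass

# Object-level integer literal "k hat": a nullary constructor
# carrying the meta-level integer k.
@dataclass(frozen=True)
class Lit:
    k: int

def delta_add(v1: Lit, v2: Lit) -> Lit:
    """R-ADD: the full application  +^ k1^ k2^  delta-reduces to (k1 + k2)^.
    The + on the right-hand side is meta-level (Python) addition."""
    return Lit(v1.k + v2.k)
```

Note that, per condition (i) above, the rule fires only on a full application of the binary destructor to two literal values; a partial application such as $\hat{+}\, \hat{1}$ stays a value.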
1.2.5 Example [Pairs, continued]: In addition to the pair constructor defined in Example 1.2.3, we may introduce two destructors $\pi_1$ and $\pi_2$ of arity 1. We may define their operational semantics as follows, for $i \in \{1, 2\}$:
$$\pi_i\, (v_1, v_2) \xrightarrow{\delta} v_i \tag{R-PROJ}$$
Thus, our treatment of constants is general enough to account for pair construction and destruction; we need not build these features explicitly into the language.
1.2.6 Exercise [Booleans, Recommended, $\star\star$]: Let true and false be nullary constructors. Let if be a ternary destructor. Extend the operational semantics with
$$\text{if true } v_{1}\; v_{2} \xrightarrow{\delta} v_{1} \tag{R-TRUE}$$
$$\text{if false } v_{1}\; v_{2} \xrightarrow{\delta} v_{2} \tag{R-FALSE}$$
Let us use the syntactic sugar if $t_{0}$ then $t_{1}$ else $t_{2}$ for the triple application if $t_{0}\;t_{1}\;t_{2}$. Explain why these definitions do not quite provide the expected behavior. Without modifying the semantics of if, suggest a new definition of the syntactic sugar if $t_{0}$ then $t_{1}$ else $t_{2}$ that corrects the problem.
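The issue can be observed concretely. In the Python sketch below (all names are illustrative; Python's strict argument evaluation stands in for call-by-value), the bare triple application evaluates both arms before the destructor can select one, whereas wrapping each arm in a $\lambda$ (a thunk) delays it until after selection.

```python
# Sketch of why "if t0 t1 t2" misbehaves under call-by-value: R-TRUE and
# R-FALSE only fire once both arms are already values. Illustrative names.

trace = []

def branch(label, result):
    trace.append(label)        # record that this arm was evaluated
    return result

def if_destructor(cond, v1, v2):
    # R-TRUE / R-FALSE: select one of two already-evaluated values.
    return v1 if cond else v2

# Naive sugar: both arms run, even though only one is selected.
if_destructor(True, branch("then", 1), branch("else", 2))
assert trace == ["then", "else"]

# Thunked sugar: each arm is delayed; apply the selected thunk afterwards.
trace.clear()
result = if_destructor(True, lambda: branch("then", 1), lambda: branch("else", 2))()
assert trace == ["then"] and result == 1
```

This is a sketch of the general repair, not the exercise's official answer: the arms become values (abstractions) immediately, so only the selected one is ever forced.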
1.2.7 Example [Sums]: Booleans may in fact be viewed as a special case of the more general concept of sum. Let $\mathrm{inj}_{1}$ and $\mathrm{inj}_{2}$ be unary constructors, called respectively left and right injections. Let case be a ternary destructor, whose semantics is defined as follows, for $i \in \{1,2\}$:
$$\operatorname{case}\,(\mathrm{inj}_{i}\; v)\; v_{1}\; v_{2} \xrightarrow{\delta} v_{i}\; v \tag{R-CASE}$$
Here, the value $\mathrm{inj}_{i}\; v$ is being scrutinized, while the values $v_{1}$ and $v_{2}$, which are typically functions, represent the two arms of a standard case construct. The rule selects an appropriate arm (here, $v_{i}$) based on whether the value under scrutiny was formed using a left or right injection. The arm $v_{i}$ is executed and given access to the data carried by the injection (here, $v$).
1.2.8 Exercise [$\star$, $\nrightarrow$]: Explain how to encode true, false, and the if construct in terms of sums. Check that the behavior of R-TRUE and R-FALSE is properly emulated.
1.2.9 Example [References]: Let ref and ! be unary destructors. Let := be a binary destructor. We write $t_{1} := t_{2}$ for the double application $:=\;t_{1}\;t_{2}$. Define the operational semantics of these three destructors as follows:
$$\begin{array}{lr}
\text{ref } v \,/\, \varnothing \xrightarrow{\delta} m \,/\, (m \mapsto v) \quad \text{if } m \text{ is fresh for } v & \text{(R-REF)} \\
!\,m \,/\, (m \mapsto v) \xrightarrow{\delta} v \,/\, (m \mapsto v) & \text{(R-DEREF)} \\
m := v \,/\, (m \mapsto v_{0}) \xrightarrow{\delta} v \,/\, (m \mapsto v) & \text{(R-ASSIGN)}
\end{array}$$
According to R-REF, evaluating ref $v$ allocates a fresh memory location $m$ and binds $v$ to it. Because configurations are considered equal up to $\alpha$-conversion of memory locations, the choice of the name $m$ is irrelevant, provided it is chosen fresh for $v$, so as to prevent inadvertent capture of the memory locations that appear free within $v$. By R-DEREF, evaluating $!\,m$ returns the value bound to the memory location $m$ within the current store. By R-ASSIGN, evaluating $m := v$ discards the value $v_{0}$ currently bound to $m$ and produces a new store where $m$ is bound to $v$. Here, the value returned by the assignment $m := v$ is $v$ itself; in ML-the-programming-language, it is usually a nullary constructor (), pronounced unit.
1.2.10 Example [Recursion]: Let fix be a binary destructor, whose operational semantics is:
$$\mathrm{fix}\; v_{1}\; v_{2} \xrightarrow{\delta} v_{1}\; (\mathrm{fix}\; v_{1})\; v_{2} \tag{R-FIX}$$
fix is a fixpoint combinator: it effectively allows recursive definitions of functions. Indeed, the construct letrec $f = \lambda z.t_{1}$ in $t_{2}$ provided by ML-the-programming-language may be viewed as syntactic sugar for let $f = \mathrm{fix}\,(\lambda f.\lambda z.t_{1})$ in $t_{2}$.
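A Python sketch of R-FIX (illustrative: the curried helper mirrors the binary destructor, and placing the recursive unfolding inside a $\lambda$ keeps evaluation call-by-value, so the partial application is a value and no infinite unfolding occurs). Factorial serves as a client.

```python
# Sketch of the call-by-value fixpoint combinator of R-FIX:
# fix v1 v2 reduces to v1 (fix v1) v2, one unfolding per application.

def fix(v1):
    # The λ delays the next unfolding: fix(v1) is a value, applied on demand.
    return lambda v2: v1(fix(v1))(v2)

# letrec fact = λn. if n = 0 then 1 else n * fact (n - 1), as sugar for fix:
fact = fix(lambda self: lambda n: 1 if n == 0 else n * self(n - 1))
assert fact(3) == 6
```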
Rule R-CONTEXT completes the definition of the operational semantics by defining $\longrightarrow$, a relation between closed configurations, in terms of itself. The rule states that reduction may take place not only at the term's root, but also deep inside it, provided the path from the root to the point where reduction occurs forms an evaluation context. This is how evaluation contexts determine an evaluation strategy. As a purely technical point, because $\longrightarrow$ relates closed configurations only, we do not need to require that the memory locations in $\operatorname{dom}(\mu' \setminus \mu)$ be fresh for $\mathcal{E}$: indeed, every memory location that appears within $\mathcal{E}$ must be a member of $\operatorname{dom}(\mu)$.
1.2.11 Exercise [$\star\star\star$, Recommended]: Assuming the availability of Booleans and conditionals, integer literals, subtraction, multiplication, integer comparison, and a fixpoint combinator, most of which were defined in previous examples, define a function that computes the factorial of its integer argument, and apply it to $\hat{3}$. Determine, step by step, how this expression reduces to a value.
It is straightforward to check that, if $t/\mu$ reduces to $t'/\mu'$, then $t$ is not a value. In other words, values are irreducible: they represent a completed computation. The proof is left as an exercise to the reader. The converse, however, does not hold: if $t/\mu$ is irreducible with respect to $\longrightarrow$, then $t$ is not necessarily a value. In that case, the configuration $t/\mu$ is said to be stuck. It represents a runtime error, that is, a situation that does not allow computation to proceed, yet is not considered a valid outcome. A closed source program $t$ is said to go wrong if and only if the configuration $t/\varnothing$ reduces to a stuck configuration.
1.2.12 Example: Runtime errors typically arise when destructors are applied to arguments of an unexpected nature. For instance, the expressions $\hat{+}\;\hat{1}\;m$ and $\pi_{1}\;\hat{2}$ and $!\,\hat{3}$ are stuck, regardless of the current store. The program let $z = \hat{+}\,\hat{+}$ in $z$ is not stuck, because $\hat{+}\,\hat{+}$ is a value. However, its reduct through R-LET is $\hat{+}\,\hat{+}$, which is stuck, so this program goes wrong. The primary purpose of type systems is to prevent such situations from arising.
1.2.13 Example: The configuration $!\,m/\mu$ is stuck if $m$ is not in the domain of $\mu$. In that case, however, $!\,m/\mu$ is not closed. Because we consider $\longrightarrow$ as a relation between closed configurations only, this situation cannot arise. In other words, the semantics of ML-the-calculus never allows the creation of dangling pointers. As a result, no particular precautions need be taken to guard against them. Several strongly typed programming languages do nevertheless allow dangling pointers in a controlled fashion (Tofte and Talpin, 1997; Crary, Walker, and Morrisett, 1999b; DeLine and Fähndrich, 2001; Grossman, Morrisett, Jim, Hicks, Wang, and Cheney, 2002a).

Damas and Milner's type system

ML-the-type-system was originally defined by Milner (1978). Here, we reproduce the definition given a few years later by Damas and Milner (1982), which is written in a more standard style: typing judgements are defined inductively by a collection of typing rules. We refer to this type system as DM.
To begin, we must define types. In DM, as in the simply-typed $\lambda$-calculus, types are first-order terms built out of type constructors and type variables. We begin with several considerations concerning the specification of type constructors.
First, we do not wish to fix the set of type constructors. Certainly, since ML-the-calculus has functions, we need to be able to form an arrow type $T \rightarrow T'$ out of arbitrary types $T$ and $T'$; that is, we need a binary type constructor $\rightarrow$. However, because ML-the-calculus includes an unspecified set of constants, we cannot say much else in general. If constants include integer literals and integer operations, as in Example 1.2.1, then a nullary type constructor int is needed; if they include pair construction and destruction, as in Examples 1.2.3 and 1.2.5, then a binary type constructor $\times$ is needed; and so on.
Second, it is common to refer to the parameters of a type constructor by position, that is, by numeric index. For instance, when one writes $T \rightarrow T'$, it is understood that the type constructor $\rightarrow$ has arity 2, that $T$ is its first parameter, known as its domain, and that $T'$ is its second parameter, known as its codomain. Here, however, we refer to parameters by name; these names are known as directions. For instance, we define two directions domain and codomain and let the type constructor $\rightarrow$ have arity $\{\text{domain}, \text{codomain}\}$. The extra generality afforded by directions is exploited in the definition of nonstructural subtyping (Example 1.3.9) and in the definition of rows (Section 1.11).
Last, we allow types to be classified using kinds. As a result, every type constructor must come not only with an arity, but with a richer signature, which describes the kinds of the types to which it is applicable and the kind of the type that it produces. A distinguished kind $\star$ is associated with "normal" types, that is, types that are directly ascribed to expressions and values. For instance, the signature of the type constructor $\rightarrow$ is $\{\text{domain} \mapsto \star, \text{codomain} \mapsto \star\} \Rightarrow \star$, because it is applicable to two "normal" types and produces a "normal" type.
Introducing kinds other than $\star$ allows viewing some terms as ill-formed types; this is illustrated, for instance, in Section 1.11. In the simplest case, however, $\star$ is really the only kind, so the signature of a type constructor is nothing but its arity (a set of directions), and every term is a well-formed type, provided every application of a type constructor respects its arity.
1.2.14 Definition: Let $d$ range over a finite or denumerable set of directions. Let $\kappa$ range over a finite or denumerable set of kinds. Let $\star$ be a distinguished kind. Let $K$ range over partial mappings from directions to kinds. Let $F$ range over a finite or denumerable set of type constructors, each of which has a signature of the form $K \Rightarrow \kappa$. The domain of $K$ is referred to as the arity of $F$, while $\kappa$ is referred to as its image kind. We write $\kappa$ instead of $K \Rightarrow \kappa$ when $K$ is empty. Let $\rightarrow$ be a type constructor of signature $\{\text{domain} \mapsto \star, \text{codomain} \mapsto \star\} \Rightarrow \star$.
The type constructors and their signatures collectively form a signature $\mathcal{S}$. In the following, we assume that a fixed signature $\mathcal{S}$ is given and that every type constructor in it has finite arity, so as to ensure that types are machine representable. However, in Section 1.11, we shall explicitly work with several distinct signatures, some of which involve type constructors of denumerable arity.
A type variable is a name that is used to stand for a type. For simplicity, we assume that every type variable is branded with a kind or, in other words, that type variables of distinct kinds are drawn from disjoint sets. Each of these sets of type variables is individually subject to $\alpha$-conversion: that is, renamings must preserve kinds. Attaching kinds to type variables is only a technical convenience: in practice, every operation performed during type inference preserves the property that every type is well-kinded, so it is not necessary to keep track of the kind of every type variable. It is only necessary to check that all types supplied by the user, within type declarations, type annotations, or module interfaces, are well-kinded.
1.2.15 Definition: For every kind $\kappa$, let $\mathcal{V}_{\kappa}$ be a disjoint, denumerable set of type variables. Let $X$, $Y$, and $Z$ range over the set $\mathcal{V}$ of all type variables. Let $\bar{X}$ and $\bar{Y}$ range over finite sets of type variables. We write $\bar{X}\bar{Y}$ for the set $\bar{X} \cup \bar{Y}$ and often write $X$ for the singleton set $\{X\}$. We write $\mathrm{ftv}(o)$ for the set of free type variables of an object $o$.
The set of types, ranged over by $T$, is the free many-kinded term algebra that arises out of the type constructors and type variables.
1.2.16 Definition: A type of kind $\kappa$ is either a member of $\mathcal{V}_{\kappa}$, or a term of the form $F\{d_{1} \mapsto T_{1}, \ldots, d_{n} \mapsto T_{n}\}$, where $F$ has signature $\{d_{1} \mapsto \kappa_{1}, \ldots, d_{n} \mapsto \kappa_{n}\} \Rightarrow \kappa$ and $T_{1}, \ldots, T_{n}$ are types of kinds $\kappa_{1}, \ldots, \kappa_{n}$, respectively.
As a notational convention, we assume that, for every type constructor $F$, the directions that form the arity of $F$ are implicitly ordered, so that we may say that $F$ has signature $\kappa_{1} \otimes \ldots \otimes \kappa_{n} \Rightarrow \kappa$ and employ the syntax $F\,T_{1} \ldots T_{n}$ for applications of $F$. Applications of the type constructor $\rightarrow$ are written infix and associate to the right, so $T \rightarrow T' \rightarrow T''$ stands for $T \rightarrow (T' \rightarrow T'')$.
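Definitions 1.2.14 to 1.2.16 can be sketched as a small kind checker in Python (the representation, the string `"*"` for $\star$, and the signature table are illustrative assumptions): a signature maps each type constructor to its directions-with-kinds and its image kind, and an application is well-kinded when it supplies exactly the constructor's directions, each with an argument of the right kind.

```python
# Sketch of types as many-kinded first-order terms. Illustrative encoding:
# a type variable is a string "name:kind" (variables are branded with a
# kind); a constructed type is (constructor, {direction: argument}).

SIG = {
    "int":   ({}, "*"),                                  # nullary, image kind ⋆
    "arrow": ({"domain": "*", "codomain": "*"}, "*"),    # arity {domain, codomain}
}

def kind_of(ty):
    """Return the kind of a well-formed type; raise ValueError otherwise."""
    if isinstance(ty, str):
        return ty.split(":")[1]          # a variable's kind is its brand
    f, args = ty
    params, image = SIG[f]
    if set(args) != set(params):         # every direction, and nothing else
        raise ValueError("arity mismatch for " + f)
    for d, t in args.items():
        if kind_of(t) != params[d]:
            raise ValueError("ill-kinded argument in direction " + d)
    return image

t = ("arrow", {"domain": ("int", {}), "codomain": "X:*"})
assert kind_of(t) == "*"                 # int → X is a "normal" type
```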
In order to give meaning to the free type variables of a type or, more generally, of a typing judgement, traditional presentations of ML-the-type-system, including Damas and Milner's, employ type substitutions. Most of our presentation avoids substitutions and uses constraints instead. However, we do need substitutions on a few occasions, especially when relating our presentation to Damas and Milner's.
1.2.17 Definition: A type substitution $\theta$ is a total, kind-preserving mapping of type variables to types that is the identity everywhere but on a finite subset of $\mathcal{V}$, which we call the domain of $\theta$ and write $\operatorname{dom}(\theta)$. The range of $\theta$, which we write $\operatorname{range}(\theta)$, is the set $\mathrm{ftv}(\theta(\operatorname{dom}(\theta)))$. A type substitution may naturally be viewed as a total, kind-preserving mapping of types to types. In the following, we write $\bar{X} \mathbin{\#} \theta$ for $\bar{X} \mathbin{\#} (\operatorname{dom}(\theta) \cup \operatorname{range}(\theta))$. We write $\theta \setminus \bar{X}$ for the restriction of $\theta$ outside $\bar{X}$, that is, the restriction of $\theta$ to $\mathcal{V} \setminus \bar{X}$. We sometimes let $\varphi$ denote a type substitution.
If $\vec{X}$ and $\vec{T}$ are respectively a vector of distinct type variables and a vector of types of the same (finite) length, such that, for every index $i$, $X_{i}$ and $T_{i}$ have the same kind, then $[\vec{X} \mapsto \vec{T}]$ denotes the substitution that maps $X_{i}$ to $T_{i}$ for every index $i$. The domain of $[\vec{X} \mapsto \vec{T}]$ is a subset of $\bar{X}$, the set underlying the vector $\vec{X}$. Its range is a subset of $\mathrm{ftv}(\bar{T})$, where $\bar{T}$ is the set underlying the vector $\vec{T}$. Every substitution $\theta$ may be written in the form $[\vec{X} \mapsto \vec{T}]$, where $\bar{X} = \operatorname{dom}(\theta)$. Then, $\theta$ is idempotent if and only if $\bar{X} \mathbin{\#} \mathrm{ftv}(\bar{T})$ holds.
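A Python sketch of applying a substitution $[\vec{X} \mapsto \vec{T}]$ to a type (the representation is an illustrative assumption: a variable is a string, a constructed type a tuple headed by its constructor): the mapping is the identity outside its domain.

```python
# Sketch of a type substitution applied to a first-order type.
# Illustrative encoding: "X" is a variable, ("arrow", dom, cod) a type.

def subst(theta, ty):
    """Apply theta (a dict from variables to types) homomorphically to ty."""
    if isinstance(ty, str):                  # a type variable ...
        return theta.get(ty, ty)             # ... identity outside dom(theta)
    f, *args = ty
    return (f, *[subst(theta, t) for t in args])

theta = {"X": ("int",)}
assert subst(theta, ("arrow", "X", "Y")) == ("arrow", ("int",), "Y")
assert subst(theta, "Z") == "Z"              # untouched: Z is outside the domain
```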
As pointed out earlier, types are first-order terms; that is, in the grammar of types, none of the productions binds a type variable. As a result, every type variable that appears within a type $T$ appears free within $T$. This situation is identical to that of the simply-typed $\lambda$-calculus. Things become more interesting when we introduce type schemes. As its name implies, a type scheme may describe an entire family of types; this effect is achieved via universal quantification over a set of type variables.
1.2.18 Definition: A type scheme $S$ is an object of the form $\forall \bar{X}.T$, where $T$ is a type of kind $\star$ and the type variables $\bar{X}$ are considered bound within $T$.
One may view the type $T$ as the trivial type scheme $\forall \varnothing.T$, where no type variables are universally quantified, so types may be viewed as a subset of type schemes. The type scheme $\forall \bar{X}.T$ may be viewed as a finite way of describing the possibly infinite family of types of the form $[\vec{X} \mapsto \vec{T}]T$, where $\vec{T}$ is arbitrary.
$$\frac{\Gamma(\mathrm{x}) = \mathrm{S}}{\Gamma \vdash \mathrm{x} : \mathrm{S}}\ \text{(DM-VAR)} \qquad \frac{\Gamma;\, \mathrm{z} : \mathrm{T} \vdash \mathrm{t} : \mathrm{T}'}{\Gamma \vdash \lambda \mathrm{z}.\mathrm{t} : \mathrm{T} \rightarrow \mathrm{T}'}\ \text{(DM-ABS)}$$
$$\frac{\Gamma \vdash \mathrm{t}_{1} : \mathrm{T} \rightarrow \mathrm{T}' \qquad \Gamma \vdash \mathrm{t}_{2} : \mathrm{T}}{\Gamma \vdash \mathrm{t}_{1}\, \mathrm{t}_{2} : \mathrm{T}'}\ \text{(DM-APP)} \qquad \frac{\Gamma \vdash \mathrm{t}_{1} : \mathrm{S} \qquad \Gamma;\, \mathrm{z} : \mathrm{S} \vdash \mathrm{t}_{2} : \mathrm{T}}{\Gamma \vdash \text{let } \mathrm{z} = \mathrm{t}_{1} \text{ in } \mathrm{t}_{2} : \mathrm{T}}\ \text{(DM-LET)}$$
$$\frac{\Gamma \vdash \mathrm{t} : \mathrm{T} \qquad \bar{\mathrm{X}} \mathbin{\#} \mathrm{ftv}(\Gamma)}{\Gamma \vdash \mathrm{t} : \forall \bar{\mathrm{X}}.\mathrm{T}}\ \text{(DM-GEN)} \qquad \frac{\Gamma \vdash \mathrm{t} : \forall \bar{\mathrm{X}}.\mathrm{T}}{\Gamma \vdash \mathrm{t} : [\vec{\mathrm{X}} \mapsto \vec{\mathrm{T}}]\mathrm{T}}\ \text{(DM-INST)}$$
Figure 1-3: Typing rules for DM
Such types are called instances of the type scheme $\forall \bar{X}.T$. Note that, throughout most of this chapter, we work with constrained type schemes, a generalization of DM type schemes (Definition 1.3.2).
Typing environments, or environments for short, are used to collect assumptions about an expression's free identifiers.
1.2.19 Definition: An environment $\Gamma$ is a finite ordered sequence of pairs of a program identifier and a type scheme. We write $\varnothing$ for the empty environment and ; for the concatenation of environments. An environment may be viewed as a finite mapping from program identifiers to type schemes by letting $\Gamma(\mathrm{x}) = \mathrm{S}$ if and only if $\Gamma$ is of the form $\Gamma_{1};\, \mathrm{x} : \mathrm{S};\, \Gamma_{2}$, where $\Gamma_{2}$ contains no assumption about $\mathrm{x}$. The set of defined program identifiers of an environment $\Gamma$, written $dpi(\Gamma)$, is defined by $dpi(\varnothing) = \varnothing$ and $dpi(\Gamma;\, \mathrm{x} : \mathrm{S}) = dpi(\Gamma) \cup \{\mathrm{x}\}$.
To complete the definition of Damas and Milner's type system, it remains to define typing judgements. A typing judgement takes the form $\Gamma \vdash \mathrm{t} : \mathrm{S}$, where $\mathrm{t}$ is an expression of interest, $\Gamma$ is an environment, which typically contains assumptions about $\mathrm{t}$'s free program identifiers, and $\mathrm{S}$ is a type scheme. Such a judgement may be read: under assumptions $\Gamma$, the expression $\mathrm{t}$ has the type scheme $\mathrm{S}$. By abuse of language, it is sometimes said that $\mathrm{t}$ has type $\mathrm{S}$. A typing judgement is valid (or holds) if and only if it may be derived using the rules that appear in Figure 1-3. An expression $\mathrm{t}$ is well-typed within the environment $\Gamma$ if and only if some judgement of the form $\Gamma \vdash \mathrm{t} : \mathrm{S}$ holds; it is ill-typed within $\Gamma$ otherwise.
Rule DM-VAR allows fetching a type scheme for an identifier $\mathrm{x}$ from the environment. It is equally applicable to program variables, memory locations, and constants. If no assumption concerning $\mathrm{x}$ appears in the environment $\Gamma$, then the rule is not applicable. In that case, the expression $\mathrm{x}$ is ill-typed within $\Gamma$; can you prove it? Assumptions about constants are usually collected in a so-called initial environment $\Gamma_{0}$. It is the environment under which closed programs are typechecked, so every subexpression is typechecked under some extension $\Gamma$ of $\Gamma_{0}$. Of course, the type schemes assigned by $\Gamma_{0}$ to constants must be consistent with their operational semantics; we say more about this later (Section 1.7). Rule DM-ABS specifies how to typecheck a $\lambda$-abstraction $\lambda \mathrm{z}.\mathrm{t}$. Its premise requires the body of the function, namely $\mathrm{t}$, to be well-typed under an extra assumption, which causes all free occurrences of $\mathrm{z}$ within $\mathrm{t}$ to receive a common type $\mathrm{T}$. Its conclusion forms the arrow type $\mathrm{T} \rightarrow \mathrm{T}'$ out of the types of the function's formal parameter, namely $\mathrm{T}$, and result, namely $\mathrm{T}'$. It is worth noting that this rule always augments the environment with a type $\mathrm{T}$ (recall that, by convention, types form a subset of type schemes), but never with a nontrivial type scheme. DM-APP states that the type of a function application is the codomain of the function's type, provided that the domain of the function's type is a valid type for the actual argument. DM-LET closely mirrors the operational semantics: whereas the semantics of the local definition let $\mathrm{z} = \mathrm{t}_{1}$ in $\mathrm{t}_{2}$ is to augment the runtime environment by binding $\mathrm{z}$ to the value of $\mathrm{t}_{1}$ prior to evaluating $\mathrm{t}_{2}$, the effect of DM-LET is to augment the typing environment by binding $\mathrm{z}$ to a type scheme for $\mathrm{t}_{1}$ prior to typechecking $\mathrm{t}_{2}$.
DM-GEN turns a type into a type scheme by universally quantifying over a set of type variables that do not appear free in the environment; this restriction is discussed in Example 1.2.20 below. DM-INST, on the contrary, turns a type scheme into one of its instances, which may be chosen arbitrarily. These two operations are referred to as generalization and instantiation. The notion of type scheme and the rules DM-GEN and DM-INST are characteristic of ML-the-type-system: they distinguish it from the simply-typed $\lambda$-calculus.
1.2.20 Example: It is unsound to allow generalizing type variables that appear free in the environment. For instance, consider the typing judgement $\mathrm{z} : \mathrm{X} \vdash \mathrm{z} : \mathrm{X}$ (1), which, according to DM-VAR, is valid. Applying an unrestricted version of DM-GEN to it, we obtain $\mathrm{z} : \mathrm{X} \vdash \mathrm{z} : \forall \mathrm{X}.\mathrm{X}$ (2), whence, by DM-INST, $\mathrm{z} : \mathrm{X} \vdash \mathrm{z} : \mathrm{Y}$ (3). By DM-ABS and DM-GEN, we then have $\varnothing \vdash \lambda \mathrm{z}.\mathrm{z} : \forall \mathrm{X}\mathrm{Y}.\mathrm{X} \rightarrow \mathrm{Y}$. In other words, the identity function has unrelated argument and result types! Then, the expression $(\lambda \mathrm{z}.\mathrm{z})\;\hat{0}\;\hat{0}$, which reduces to the stuck expression $\hat{0}\;\hat{0}$, has type scheme $\forall \mathrm{Z}.\mathrm{Z}$. So, well-typed programs may cause runtime errors: the type system is unsound.
What happened? It is clear that the judgement (1) is correct only because the type assigned to $\mathrm{z}$ is the same in its assumption and in its right-hand side. For the same reason, the judgements (2) and (3), the former of which may be written $\mathrm{z} : \mathrm{X} \vdash \mathrm{z} : \forall \mathrm{Y}.\mathrm{Y}$, are incorrect. Indeed, such judgements defeat the very purpose of environments, since they disregard their assumption.
By universally quantifying over $X$ in the right-hand side only, we break the connection between occurrences of $X$ in the assumption, which remain free, and occurrences in the right-hand side, which become bound. This is correct only if there are in fact no free occurrences of $X$ in the assumption.
It is a key feature of ML-the-type-system that DM-Abs may only introduce a type $T$, rather than a type scheme, into the environment. Indeed, this allows the rule's conclusion to form the arrow type $T \to T'$. If instead the rule were to introduce the assumption $z : S$ into the environment, then its conclusion would have to form $S \to T'$, which is not a well-formed type. In other words, this restriction is necessary to preserve the stratification between types and type schemes. If we were to remove this stratification, thus allowing universal quantifiers to appear deep inside types, we would obtain an implicitly-typed version of System F (TAPL Chapter 23). Type inference for System F is undecidable (Wells, 1999), while type inference for ML-the-type-system is decidable, as we show later, so this design choice has a rather drastic impact.
1.2.21 Exercise [$\star$, Recommended]: Build a type derivation for the expression $\lambda z_1.\,\mathsf{let}\ z_2 = z_1\ \mathsf{in}\ z_2$ within DM.
1.2.22 Exercise [$\star$, Recommended]: Let $\mathsf{int}$ be a nullary type constructor of signature $\star$. Let $\Gamma_0$ consist of the bindings $\hat{+} : \mathsf{int} \to \mathsf{int} \to \mathsf{int}$ and $\hat{k} : \mathsf{int}$, for every integer $k$. Can you find derivations of the following valid typing judgements? Which of these judgements are valid in the simply-typed $\lambda$-calculus, where $\mathsf{let}\ z = t_1\ \mathsf{in}\ t_2$ is syntactic sugar for $(\lambda z.t_2)\,t_1$?
$$\begin{gathered}
\Gamma_0 \vdash \lambda z.z : \mathsf{int} \to \mathsf{int}\\
\Gamma_0 \vdash \lambda z.z : \forall X.\, X \to X\\
\Gamma_0 \vdash \mathsf{let}\ f = \lambda z.\, z \mathbin{\hat{+}} \hat{1}\ \mathsf{in}\ f\,\hat{2} : \mathsf{int}\\
\Gamma_0 \vdash \mathsf{let}\ f = \lambda z.z\ \mathsf{in}\ f\,f\,\hat{2} : \mathsf{int}
\end{gathered}$$
Show that the expressions $\hat{1}\,\hat{2}$ and $\lambda f.(f\,f)$ are ill-typed within $\Gamma_0$. Could these expressions be well-typed in a more powerful type system?
1.2.23 Exercise [$\star\star\star$]: In fact, the rules shown in Figure 1-3 are not exactly Damas and Milner's original rules. In (Damas and Milner, 1982), the generalization and instantiation rules are:
$$\frac{\Gamma \vdash t : S \qquad X \notin ftv(\Gamma)}{\Gamma \vdash t : \forall X.S}\tag{DM-Gen'}$$
$$\frac{\Gamma \vdash t : \forall \bar{X}.T \qquad \bar{Y} \mathrel{\#} ftv(\forall \bar{X}.T)}{\Gamma \vdash t : \forall \bar{Y}.[\bar{X} \mapsto \bar{T}]T}\tag{DM-Inst'}$$
where $\forall X.S$ stands for $\forall X\bar{X}.T$ if $S$ stands for $\forall \bar{X}.T$. Show that the combination of DM-Gen' and DM-Inst' is equivalent to the combination of DM-Gen and DM-Inst.
DM enjoys a number of nice theoretical properties, which have practical implications. First, under suitable hypotheses about the semantics of constants, about the type schemes that they receive in the initial environment, and, in the presence of side effects, under a slight restriction of the syntax of let constructs, it is possible to show that the type system is sound: that is, well-typed (closed) programs do not go wrong. This essential property ensures that programs that are accepted by the typechecker may be compiled without runtime checks. Furthermore, it is possible to show that there exists an algorithm that, given a (closed) environment $\Gamma$ and a program $t$, tells whether $t$ is well-typed with respect to $\Gamma$, and, if so, produces a principal type scheme $S$. A principal type scheme is such that (i) it is valid, that is, $\Gamma \vdash t : S$ holds, and (ii) it is most general, that is, every judgement of the form $\Gamma \vdash t : S'$ follows from $\Gamma \vdash t : S$ by DM-Inst and DM-Gen. (For the sake of simplicity, we have stated the properties of the type inference algorithm only in the case of a closed environment $\Gamma$; the specification is slightly heavier in the general case.) This implies that type inference is decidable: the compiler does not require expressions to be annotated with types. It also implies that, under a fixed environment $\Gamma$, all of the type information associated with an expression $t$ may be summarized in the form of a single (principal) type scheme, which is very convenient.
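Such an algorithm can be sketched very compactly. The following Python program is only an illustration of the properties just claimed, in the classic Algorithm W style rather than the constraint-based style developed in this chapter; all names and the term encoding are ours. It uses destructive unification, binds $\lambda$-bound variables monomorphically (as in DM-Abs), and generalizes at let (as in DM-Gen):

```python
# A minimal, self-contained sketch of Damas-Milner type inference in the
# style of Algorithm W (illustration only; not the chapter's algorithm).
import itertools

_fresh = itertools.count()

class Var:                       # a type variable
    def __init__(self):
        self.id = next(_fresh)
        self.instance = None     # set once the variable is unified

class Con:                       # a type constructor application
    def __init__(self, name, *args):
        self.name, self.args = name, args

INT = Con("int")
def arrow(a, b): return Con("->", a, b)

def prune(t):
    """Follow instantiated variables to their representative."""
    if isinstance(t, Var) and t.instance is not None:
        t.instance = prune(t.instance)
        return t.instance
    return t

def occurs(v, t):
    t = prune(t)
    return t is v or (isinstance(t, Con) and any(occurs(v, a) for a in t.args))

def unify(a, b):
    a, b = prune(a), prune(b)
    if isinstance(a, Var):
        if a is not b:
            if occurs(a, b): raise TypeError("recursive type")
            a.instance = b
    elif isinstance(b, Var):
        unify(b, a)
    elif a.name != b.name or len(a.args) != len(b.args):
        raise TypeError("type clash")
    else:
        for x, y in zip(a.args, b.args): unify(x, y)

class Scheme:                    # forall qvars. body
    def __init__(self, qvars, body):
        self.qvars, self.body = qvars, body

def free_vars(t):
    t = prune(t)
    if isinstance(t, Var): return {t}
    return set().union(set(), *(free_vars(a) for a in t.args))

def generalize(env, t):
    """DM-Gen: quantify the variables not free in the environment."""
    env_fv = set()
    for s in env.values():
        env_fv |= free_vars(s.body) - set(s.qvars)
    return Scheme([v for v in free_vars(t) if v not in env_fv], t)

def instantiate(s):
    """DM-Inst: replace the quantified variables with fresh ones."""
    m = {v: Var() for v in s.qvars}
    def cp(t):
        t = prune(t)
        if isinstance(t, Var): return m.get(t, t)
        return Con(t.name, *(cp(a) for a in t.args))
    return cp(s.body)

def infer(env, term):
    """Terms: ('var',x) ('int',n) ('lam',x,t) ('app',t1,t2) ('let',x,t1,t2)."""
    tag = term[0]
    if tag == "var": return instantiate(env[term[1]])
    if tag == "int": return INT
    if tag == "lam":
        _, x, body = term
        a = Var()                                  # DM-Abs: a type, not a scheme
        return arrow(a, infer({**env, x: Scheme([], a)}, body))
    if tag == "app":
        _, f, arg = term
        r = Var()
        unify(infer(env, f), arrow(infer(env, arg), r))
        return r
    _, x, t1, t2 = term                            # let
    s = generalize(env, infer(env, t1))            # generalization at let
    return infer({**env, x: s}, t2)
```

For instance, `generalize({}, infer({}, ("lam", "z", ("var", "z"))))` produces a scheme with one quantified variable, the principal type scheme $\forall X.\,X \to X$ of the identity; the let-polymorphic term $\mathsf{let}\ f = \lambda z.z\ \mathsf{in}\ f\,f\,\hat{2}$ is accepted at type int; and self-application of a $\lambda$-bound variable is rejected by the occurs check.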

Road map

Before proving the above claims, we first generalize our presentation by moving to a constraint-based setting. The necessary tools, namely the constraint language, its interpretation, and a number of constraint equivalence laws, are introduced in Section 1.3. In Section 1.4, we describe the standard constraint-based type system HM($X$) (Odersky, Sulzmann, and Wehr, 1999a; Sulzmann, Müller, and Zenger, 1999; Sulzmann, 2000). We prove that, when constraints are made up of equations between free, finite terms, HM($X$) is a reformulation of DM. In the presence of a more powerful constraint language, HM($X$) is an extension of DM. In Section 1.5, we propose an original reformulation of HM($X$), dubbed PCB($X$), whose distinctive feature is to exploit type scheme introduction and instantiation constraints. In Section 1.6, we show that, thanks to the extra expressive power afforded by these constraint forms, type inference may be viewed as a combination of constraint generation and constraint solving, as promised earlier. Indeed, we define a constraint generator and relate it with PCB($X$). Then, in Section 1.7, we give a type soundness
theorem. It is stated purely in terms of constraints, but, thanks to the results developed in the previous sections, applies equally to PCB($X$), HM($X$), and DM.
Throughout this core material, the syntax and interpretation of constraints are left partly unspecified. Thus, the development is parameterized with respect to them, hence the unknown $X$ in the names HM($X$) and PCB($X$). We really describe a family of constraint-based type systems, all of which share a common constraint generator and a common type soundness proof. Constraint solving, however, cannot be independent of $X$: on the contrary, the design of an efficient solver is heavily dependent on the syntax and interpretation of constraints. In Section 1.8, we consider constraint solving in the particular case where constraints are made up of equations interpreted in a free tree model, and define a constraint solver on top of a standard first-order unification algorithm.
The remainder of this chapter deals with extensions of the framework. In Section 1.9, we explain how to extend ML-the-calculus with a number of features, including data structures, pattern matching, and type annotations. In Section 1.10, we extend the constraint language with universal quantification and describe a number of extra features that require this extension, including a different flavor of type annotations, polymorphic recursion, and first-class universal and existential types. Last, in Section 1.11, we extend the constraint language with rows and describe their applications, which include extensible variants and records.

1.3 Constraints

In this section, we define the syntax and logical meaning of constraints. Both are partly unspecified. Indeed, the set of type constructors (Definition 1.2.14) must contain at least the binary type constructor $\to$, but might contain more. Similarly, the syntax of constraints involves a set of so-called predicates on types, which we require to contain at least a binary subtyping predicate $\leq$, but might contain more. Furthermore, the logical interpretation of type constructors and of predicates is left almost entirely unspecified. This freedom allows reasoning not only about Damas and Milner's type system, but also about a family of constraint-based extensions of it.
Type constructors other than $\to$ and predicates other than $\leq$ will never explicitly appear in the definition of our constraint-based type systems, precisely because the definition is parametric with respect to them. They can (and usually do) appear in the type schemes assigned to constructors and destructors by the initial environment $\Gamma_0$.
Figure 1-4: Syntax of type schemes and constraints

The introduction of subtyping has little impact on the complexity of our proofs, yet increases the framework's expressive power. When subtyping is not desired, we interpret the predicate $\leq$ as equality.

Syntax

We now define the syntax of constrained type schemes and of constraints, and introduce some extra constraint forms as syntactic sugar.
1.3.1 Definition: Let $P$ range over a finite or denumerable set of predicates, each of which has a signature of the form $\kappa_1 \otimes \cdots \otimes \kappa_n \Rightarrow \cdot$, where $n \geq 0$. Let $\leq$ be a distinguished predicate of signature $\star \otimes \star \Rightarrow \cdot$.
1.3.2 Definition: The syntax of type schemes and constraints is given in Figure 1-4. It is further restricted by the following requirements. In the type scheme $\forall \bar{X}[C].T$ and in the constraint $x \preceq T$, the type $T$ must have kind $\star$. In the constraint $P\,T_1 \ldots T_n$, the types $T_1, \ldots, T_n$ must have kind $\kappa_1, \ldots, \kappa_n$, respectively, if $P$ has signature $\kappa_1 \otimes \cdots \otimes \kappa_n \Rightarrow \cdot$. We write $\forall \bar{X}.T$ for $\forall \bar{X}[\mathsf{true}].T$, which allows viewing DM type schemes as a subset of constrained type schemes.
We write $T_1 \leq T_2$ for the binary predicate application $\leq T_1\,T_2$, and call it a subtyping constraint. By convention, $\exists$ and def bind tighter than $\wedge$; that is, $\exists \bar{X}.C \wedge D$ is $(\exists \bar{X}.C) \wedge D$ and $\mathsf{def}\ x : \sigma\ \mathsf{in}\ C \wedge D$ is $(\mathsf{def}\ x : \sigma\ \mathsf{in}\ C) \wedge D$. In $\forall \bar{X}[C].T$, the type variables $\bar{X}$ are bound within $C$ and $T$. In $\exists \bar{X}.C$, the type variables $\bar{X}$ are bound within $C$. The sets of free type variables of a type scheme $\sigma$ and of a constraint $C$, written $ftv(\sigma)$ and $ftv(C)$, respectively, are defined accordingly. In $\mathsf{def}\ x : \sigma\ \mathsf{in}\ C$, the identifier $x$ is bound within $C$. The sets
of free program identifiers of a type scheme $\sigma$ and of a constraint $C$, written $fpi(\sigma)$ and $fpi(C)$, respectively, are defined accordingly. Please note that $x$ occurs free in the constraint $x \preceq T$.
We immediately introduce a number of derived constraint forms:
1.3.3 Definition: Let $\sigma$ be $\forall \bar{X}[C].T$. If $\bar{X} \mathrel{\#} ftv(T')$ holds, then $\sigma \preceq T'$ (read: $T'$ is an instance of $\sigma$) stands for the constraint $\exists \bar{X}.(C \wedge T \leq T')$. We write $\exists \sigma$ (read: $\sigma$ has an instance) for $\exists \bar{X}.C$, and $\mathsf{let}\ x : \sigma\ \mathsf{in}\ C$ for $\exists \sigma \wedge \mathsf{def}\ x : \sigma\ \mathsf{in}\ C$.
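These derived forms are purely syntactic abbreviations and can be expanded mechanically. The following sketch (our own illustration; the constructor class names are not from the text) represents constraints as plain data and expands $\sigma \preceq T'$ and $\exists\sigma$ into the primitive forms of Definition 1.3.3:

```python
# Constraints and constrained type schemes as plain data, with the
# derived forms of Definition 1.3.3 expanded into primitive constructs.
# The class names are ours; only the expansion rules come from the text.
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class True_:                                # the constraint "true"
    pass

@dataclass(frozen=True)
class Le:                                   # subtyping constraint T1 <= T2
    lhs: object
    rhs: object

@dataclass(frozen=True)
class And:                                  # conjunction C1 /\ C2
    left: object
    right: object

@dataclass(frozen=True)
class Exists:                               # existential quantification
    vars: Tuple[str, ...]
    body: object

@dataclass(frozen=True)
class Scheme:                               # constrained type scheme
    vars: Tuple[str, ...]
    constraint: object
    typ: object

def instance(sigma: Scheme, t) -> Exists:
    # sigma <= T'  stands for  Exists Xbar. (C /\ T <= T'),
    # provided Xbar # ftv(T')
    return Exists(sigma.vars, And(sigma.constraint, Le(sigma.typ, t)))

def has_instance(sigma: Scheme) -> Exists:
    # Exists sigma  stands for  Exists Xbar. C
    return Exists(sigma.vars, sigma.constraint)
```

For the DM type scheme $\forall X.\,X \to X$, that is, $\forall X[\mathsf{true}].X \to X$, the instantiation constraint against $Y \to Z$ expands to $\exists X.(\mathsf{true} \wedge X \to X \leq Y \to Z)$, exactly the form discussed in the comparison that follows.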
Constrained type schemes generalize Damas and Milner's type schemes, while our definition of instantiation constraints generalizes Damas and Milner's instance relation (Definition 1.2.18). Let us draw a comparison. First, Damas and Milner's instance relation yields a "yes/no" answer, and is purely syntactic: for instance, the type $Y \to Z$ is not an instance of $\forall X.X \to X$ in Damas and Milner's sense, because $Y$ and $Z$ are distinct type variables. In our presentation, on the other hand, $\forall X.X \to X \preceq Y \to Z$ is not an assertion; rather, it is a constraint, which by definition is $\exists X.(\mathsf{true} \wedge X \to X \leq Y \to Z)$. We later prove that it is equivalent to $\exists X.(Y \leq X \wedge X \leq Z)$ and to $Y \leq Z$, or, if subtyping is interpreted as equality, to $Y = Z$. That is, $\sigma \preceq T'$ represents a condition on (the types denoted by) the type variables in $ftv(\sigma, T')$ for $T'$ to be an instance of $\sigma$, in a logical, rather than purely syntactic, sense. Second, the definition of instantiation constraints involves subtyping, so as to ensure that any supertype of an instance of $\sigma$ is again an instance of $\sigma$ (see rule C-ExTrans in Figure 1-6 and Lemma 1.3.17).
This is consistent with the purpose of subtyping, which is to allow supplying a subtype where a supertype is expected (TAPL Chapter 15). Third and last, every type scheme now carries a constraint. The constraint $C$, whose free type variables may or may not be members of $\bar{X}$, restricts the instances of the type scheme $\forall \bar{X}[C].T$. This is expressed in the instantiation constraint $\exists \bar{X}.(C \wedge T \leq T')$, where the values that $\bar{X}$ may assume are restricted by the requirement that $C$ be satisfied. This requirement vanishes in the case of DM type schemes, where $C$ is true. Our notions of constrained type scheme and of instantiation constraint are standard: they are exactly those of HM($X$) (Odersky, Sulzmann, and Wehr, 1999a).
The constraint true, which is always satisfied, mainly serves to indicate the absence of a nontrivial constraint, while false, which has no solution, may be understood as the indication of a type error. Composite constraints include conjunction and existential quantification, which have their standard meaning, as well as type scheme introduction and type scheme instantiation constraints, which are similar to Gustavsson and Svenningsson's constraint abstractions (2001b). In short, the construct $\mathsf{def}\ x : \sigma\ \mathsf{in}\ C$ binds the name $x$ to the type scheme $\sigma$ within the constraint $C$. If $C$ contains a subconstraint of
the form $x \preceq T$, where this occurrence of $x$ is free in $C$, then this subconstraint acquires the meaning $\sigma \preceq T$. Thus, the constraint $x \preceq T$ is indeed an instantiation constraint, where the type scheme that is being instantiated is referred to by name. The constraint $\mathsf{def}\ x : \sigma\ \mathsf{in}\ C$ may be viewed as an explicit substitution of the type scheme $\sigma$ for the name $x$ within $C$. Later (Section 1.5), we use such explicit substitutions to supplant typing environments. That is, where Damas and Milner's type system augments the current typing environment (DM-Abs, DM-Let), we introduce a new def binding in the current constraint; where it looks up the current typing environment (DM-Var), we employ an instantiation constraint. The point is that it is then up to a constraint solver to choose a strategy for reducing explicit substitutions, for instance, one might wish to simplify $\sigma$ before substituting it for $x$ within $C$, whereas the use of environments in standard type systems such as DM and HM($X$) imposes an eager substitution strategy, which is inefficient and thus never literally implemented. The use of type scheme introduction and instantiation constraints allows separating constraint generation and constraint solving without compromising efficiency, or, in other words, without introducing a gap between the description of the type inference algorithm and its actual implementation. Although the algorithm that we plan to describe is not new, its description in terms of constraints is: to the best of our knowledge, the only close relative of our def constraints is to be found in (Gustavsson and Svenningsson, 2001b).
Fähndrich, Rehof, and Das's instantiation constraints (2000) are also related, but may be recursive and are meant to be solved using a semi-unification procedure, as opposed to a unification algorithm extended with facilities for creating and instantiating type schemes, as in our case.
One consequence of introducing constraints inside type schemes is that some type schemes have no instances at all, or have instances only if a certain constraint holds. For instance, the type scheme $\sigma = \forall X[\mathsf{bool} = \mathsf{int}].X$, where the nullary type constructors int and bool have distinct interpretations, has no instances; that is, no constraint of the form $\sigma \preceq T'$ has a solution. The type scheme $\sigma = \forall Z[X = Y \to Z].Z$ has an instance only if $X = Y \to Z$ holds for some $Z$; in other words, for every $T'$, $\sigma \preceq T'$ entails $\exists Z.(X = Y \to Z)$. (We define entailment on page 29.) We later prove that the constraint $\exists \sigma$ is equivalent to $\exists Z.\sigma \preceq Z$, where $Z \notin ftv(\sigma)$ (Exercise 1.3.23). That is, $\exists \sigma$ expresses the requirement that $\sigma$ have an instance. Type schemes that do not have an instance indicate a type error, so in many situations, one wishes to avoid them; for this reason, we often use the constraint form $\mathsf{let}\ x : \sigma\ \mathsf{in}\ C$, which requires $\sigma$ to have an instance and at the same time associates it with the name $x$. Because the def form is more primitive, it is easier to work with at a low level, but it is no longer explicitly used after Section 1.3; we always use let instead.
1.3.4 Definition: Environments $\Gamma$ remain as in Definition 1.2.19, except DM type schemes $S$ are replaced with constrained type schemes $\sigma$. We write $dfpi(\Gamma)$ for $dpi(\Gamma) \cup fpi(\Gamma)$. We define $\mathsf{def}\ \varnothing\ \mathsf{in}\ C = C$ and $\mathsf{def}\ \Gamma; x : \sigma\ \mathsf{in}\ C = \mathsf{def}\ \Gamma\ \mathsf{in}\ \mathsf{def}\ x : \sigma\ \mathsf{in}\ C$. Similarly, we define $\mathsf{let}\ \varnothing\ \mathsf{in}\ C = C$ and $\mathsf{let}\ \Gamma; x : \sigma\ \mathsf{in}\ C = \mathsf{let}\ \Gamma\ \mathsf{in}\ \mathsf{let}\ x : \sigma\ \mathsf{in}\ C$. We define $\exists \varnothing = \mathsf{true}$ and $\exists(\Gamma; x : \sigma) = \exists \Gamma \wedge \mathsf{def}\ \Gamma\ \mathsf{in}\ \exists \sigma$.
In order to establish or express certain laws of equivalence between constraints, we need constraint contexts. A context is a constraint with zero, one, or several holes, written $[]$. The syntax of contexts is as follows:
$$\mathcal{C} ::= [] \mid C \mid \mathcal{C} \wedge \mathcal{C} \mid \exists \bar{X}.\mathcal{C} \mid \mathsf{def}\ x : \sigma\ \mathsf{in}\ \mathcal{C} \mid \mathsf{def}\ x : \forall \bar{X}[\mathcal{C}].T\ \mathsf{in}\ C$$
The application of a constraint context $\mathcal{C}$ to a constraint $C$, written $\mathcal{C}[C]$, is defined in the usual way. Because a context may have any number of holes, $C$ may disappear or be duplicated in the process. Because a hole may appear in the scope of a binder, some of $C$'s free type variables and free program identifiers may become bound in $\mathcal{C}[C]$. We write $dtv(\mathcal{C})$ and $dpi(\mathcal{C})$ for the sets of type variables and program identifiers, respectively, that $\mathcal{C}$ may thus capture. We write $\mathsf{let}\ x : \forall \bar{X}[\mathcal{C}].T\ \mathsf{in}\ C$ for $\exists \bar{X}.\mathcal{C} \wedge \mathsf{def}\ x : \forall \bar{X}[\mathcal{C}].T\ \mathsf{in}\ C$. Being able to state such a definition is why we require multi-hole contexts. We let $\mathcal{X}$ range over existential constraint contexts, defined by $\mathcal{X} ::= [] \mid \exists \bar{X}.\mathcal{X}$.
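Context application can be made concrete with a small sketch (ours; the text does not prescribe a representation), in which a context is a constraint tree containing hole nodes, and filling substitutes the same constraint at every hole, so that it may be dropped or duplicated:

```python
# Multi-hole constraint contexts: filling substitutes the given
# constraint at every hole, so it may vanish or be duplicated.
# Representation and names are ours.
from dataclasses import dataclass

@dataclass(frozen=True)
class Hole:                      # the hole []
    pass

@dataclass(frozen=True)
class And:                       # conjunction of contexts
    left: object
    right: object

@dataclass(frozen=True)
class Exists:                    # existential binder over a context
    var: str
    body: object

@dataclass(frozen=True)
class Le:                        # a plain (hole-free) atomic constraint
    lhs: str
    rhs: str

def fill(ctx, c):
    """Compute C[c]: replace every hole in ctx by the constraint c."""
    if isinstance(ctx, Hole):
        return c
    if isinstance(ctx, And):
        return And(fill(ctx.left, c), fill(ctx.right, c))
    if isinstance(ctx, Exists):
        return Exists(ctx.var, fill(ctx.body, c))   # c's free X may be captured
    return ctx                                      # plain constraint: no holes
```

Filling the two-hole context $[] \wedge []$ duplicates the argument, filling a hole-free context discards it, and filling $\exists X.[]$ captures the free occurrences of $X$ in the argument, which is exactly why the set $dtv(\mathcal{C})$ must be tracked.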

Meaning

We have defined the syntax of constraints and given an informal description of their meaning. We now give a formal definition of the interpretation of constraints. We begin with the definition of a model:
1.3.5 Definition: For every kind $\kappa$, let $\mathcal{M}_\kappa$ be a nonempty set, whose elements are the ground types of kind $\kappa$. In the following, $t$ ranges over $\mathcal{M}_\kappa$, for some $\kappa$ that may be determined from the context. For every type constructor $F$ of signature $K \Rightarrow \kappa$, let $F$ denote a total function from $\mathcal{M}_K$ into $\mathcal{M}_\kappa$, where the indexed product $\mathcal{M}_K$ is the set of all mappings of domain $dom(K)$ that map every $d \in dom(K)$ to an element of $\mathcal{M}_{K(d)}$. For every predicate $P$ of signature $\kappa_1 \otimes \cdots \otimes \kappa_n \Rightarrow \cdot$, let $P$ denote a predicate on $\mathcal{M}_{\kappa_1} \times \cdots \times \mathcal{M}_{\kappa_n}$. We require the predicate $\leq$ on $\mathcal{M}_\star \times \mathcal{M}_\star$ to be a partial order.
For the sake of convenience, we abuse notation and write $F$ for both the type constructor and its interpretation; similarly for predicates. We freely assume that a binary equality predicate, whose interpretation is equality on $\mathcal{M}_\kappa$, is available at every kind $\kappa$, so $T_1 = T_2$, where $T_1$ and $T_2$ have kind $\kappa$, is a well-formed constraint.
By varying the set of type constructors, the set of predicates, the set of ground types, and the interpretation of type constructors and predicates, one may define an entire family of related type systems. We informally refer to the collection of these choices as $X$. Thus, the type systems HM($X$) and PCB($X$), described in Sections 1.4 and 1.5, are parameterized by $X$.
The following examples give standard ways of defining the set of ground types and the interpretation of type constructors.
1.3.6 Example [Syntactic models]: For every kind $\kappa$, let $\mathcal{M}_\kappa$ consist of the closed types of kind $\kappa$. Then, ground types are types that do not have any free type variables, and form the so-called Herbrand universe. Let every type constructor $F$ be interpreted as itself. Models that define ground types and interpret type constructors in this manner are referred to as syntactic.
1.3.7 Example [Tree models]: Let a path $\pi$ be a finite sequence of directions. The empty path is written $\epsilon$ and the concatenation of the paths $\pi$ and $\pi'$ is written $\pi \cdot \pi'$. Let a tree be a partial function $t$ from paths to type constructors whose domain is nonempty and prefix-closed and such that, for every path $\pi$ in the domain of $t$, if the type constructor $t(\pi)$ has signature $K \Rightarrow \kappa$, then $\pi \cdot d \in dom(t)$ is equivalent to $d \in dom(K)$ and, furthermore, for every $d \in dom(K)$, the type constructor $t(\pi \cdot d)$ has image kind $K(d)$. If $\pi$ is in the domain of $t$, then the subtree of $t$ rooted at $\pi$, written $t/\pi$, is the partial function $\pi' \mapsto t(\pi \cdot \pi')$. A tree is finite if and only if it has finite domain. A tree is regular if and only if it has a finite number of distinct subtrees. Every finite tree is thus regular. Let $\mathcal{M}_\kappa$ consist of the finite (resp. regular) trees $t$ such that $t(\epsilon)$ has image kind $\kappa$: then, we have a finite (resp. regular) tree model.
If $F$ has signature $K \Rightarrow \kappa$, one may interpret $F$ as the function that maps $T \in \mathcal{M}_K$ to the ground type $t \in \mathcal{M}_\kappa$ defined by $t(\epsilon) = F$ and $t/d = T(d)$ for $d \in dom(T)$, that is, the unique ground type whose head symbol is $F$ and whose subtree rooted at $d$ is $T(d)$. Then, we have a free tree model. Please note that free finite tree models coincide with syntactic models, as defined in the previous example.
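As a concrete illustration (ours, not part of the text), finite trees can be represented as dictionaries mapping paths, encoded as tuples of directions, to constructor names; the subtree operation and the free interpretation of a constructor then read directly off the definitions above:

```python
# Finite trees as partial maps from paths (tuples of directions) to
# constructor names, after Example 1.3.7. The direction names "dom" and
# "cod" for the arrow constructor are illustrative assumptions.

def subtree(t, pi):
    """t / pi: the partial function pi' -> t(pi . pi')."""
    n = len(pi)
    return {path[n:]: c for path, c in t.items() if path[:n] == pi}

def apply_constructor(f, children):
    """Free interpretation of F: the unique tree t with t(eps) = F
    and t/d = children[d] for every direction d."""
    t = {(): f}
    for d, td in children.items():
        for path, c in td.items():
            t[(d,) + path] = c
    return t

# int -> int, as the tree with head "->" and two leaves
INT = {(): "int"}
ARR = apply_constructor("->", {"dom": INT, "cod": INT})
```

The domain of `ARR` is prefix-closed by construction, and taking the subtree at a direction recovers the corresponding child, as the definition of a free tree model requires.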
Rows (Section 1.11) are interpreted in a tree model, albeit not a free one. The following examples suggest different ways of interpreting the subtyping predicate.
1.3.8 Example [Equality models]: The simplest way of interpreting the subtyping predicate is to let $\leq$ denote equality on every $\mathcal{M}_\kappa$. Models that do so are referred to as equality models. When no predicate other than equality is available, we say that the model is equality-only.
1.3.9 Example [Structural, nonstructural subtyping]: Let a variance ν be a nonempty subset of {−, +}, written − (contravariant), + (covariant), or ± (invariant) for short. Define the composition of two variances as an associative commutative operation with + as neutral element and such that − · − = + and ± · − = ± · ± = ±. Now, consider a free (finite or regular) tree model, where every direction d comes with a fixed variance ν(d). Define the variance ν(π) of a path π as the composition of the variances of its elements. Let ⩽ be a partial order on type constructors such that (i) if F₁ ⩽ F₂ holds and F₁ and F₂ have signatures K₁ ⇒ κ₁ and K₂ ⇒ κ₂, respectively, then K₁ and K₂ agree on the intersection of their domains and κ₁ and κ₂ coincide; and (ii) F₀ ⩽ F₁ ⩽ F₂ implies dom(F₀) ∩ dom(F₂) ⊆ dom(F₁). Let ⩽⁺, ⩽⁻, and ⩽^± stand for ⩽, ⩾, and =, respectively.
Then, define the interpretation of subtyping as follows: if t₁, t₂ ∈ M_κ, let t₁ ≤ t₂ hold if and only if, for every path π ∈ dom(t₁) ∩ dom(t₂), t₁(π) ⩽^ν(π) t₂(π) holds. It is not difficult to check that ≤ is a partial order on every M_κ. The reader is referred to (Kozen, Palsberg, and Schwartzbach, 1995) for more details about this construction. Models that define subtyping in this manner are referred to as nonstructural subtyping models.
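This interpretation can be sketched executably. The sketch below assumes a constructor order with a least element bot and a greatest element top, and the directions domain and codomain declared contra- and covariant; the dict encoding of trees is illustrative:

```python
VARIANCE = {"domain": "-", "codomain": "+"}  # assumed variance declarations

def compose(v1, v2):
    # '+' is neutral; '-' composed with '-' gives '+'; '±' is absorbing.
    if "±" in (v1, v2):
        return "±"
    return "+" if v1 == v2 else "-"

def path_variance(path):
    v = "+"  # the empty path is covariant
    for d in path:
        v = compose(v, VARIANCE[d])
    return v

def leq_constructor(f1, f2):  # the partial order on type constructors
    return f1 == f2 or f1 == "bot" or f2 == "top"

def leq(t1, t2):
    """t1 <= t2 iff t1(pi) <=^{nu(pi)} t2(pi) on the common domain."""
    for pi in set(t1) & set(t2):
        v = path_variance(pi)
        ok = (leq_constructor(t1[pi], t2[pi]) if v == "+"
              else leq_constructor(t2[pi], t1[pi]) if v == "-"
              else t1[pi] == t2[pi])
        if not ok:
            return False
    return True

bot = {(): "bot"}
arrow_bot_top = {(): "->", ("domain",): "bot", ("codomain",): "top"}
arrow_top_bot = {(): "->", ("domain",): "top", ("codomain",): "bot"}
assert leq(bot, arrow_bot_top)            # bot is the least ground type
assert leq(arrow_top_bot, arrow_bot_top)  # contravariant domain, covariant codomain
assert not leq(arrow_bot_top, arrow_top_bot)
```

Note how the comparison ranges only over the common domain, which is what lets trees of different shapes, such as bot and an arrow type, be related.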
A simple nonstructural subtyping model is obtained by letting the directions domain and codomain be contra- and covariant, respectively, and introducing, in addition to the type constructor →, two type constructors ⊥ and ⊤ of signature ⋆. This gives rise to a model where ⊥ is the least ground type, ⊤ is the greatest ground type, and the arrow type constructor is, as usual, contravariant in its domain and covariant in its codomain.
A typical use of nonstructural subtyping is in type systems for records. One may, for instance, introduce a covariant direction content of kind ⋆, a kind ⋄, a type constructor abs of signature ⋄, a type constructor pre of signature {content ↦ ⋆} ⇒ ⋄, and let pre ⩽ abs. This gives rise to a model where pre t ≤ abs holds for every t ∈ M_⋆. This form of subtyping is called nonstructural because comparable ground types may have different shapes, such as ⊥ and ⊥ → ⊤, or pre ⊤ and abs. Nonstructural subtyping has been studied, for example, in (Kozen, Palsberg, and Schwartzbach, 1995; Palsberg, Wand, and O'Keefe, 1997; Pottier, 2001b; Niehren and Priesnitz, 2003). Section 1.11 says more about typechecking operations on records.
An important particular case arises when any two type constructors related by ⩽ have the same arity. In that case, it is not difficult to show that any two ground types related by subtyping must have the same shape, that is, if t₁ ≤ t₂ holds, then dom(t₁) and dom(t₂) coincide. For this reason, such an interpretation of subtyping is usually referred to as atomic or structural subtyping. It has been studied in the finite (Mitchell, 1984, 1991b; Frey, 1997; Rehof, 1997; Kuncak and Rinard, 2003; Simonet, 2003) and regular (Tiuryn and Wand, 1993) cases. Structural subtyping is often used in automated program analyses that enrich standard types with atomic annotations without altering their shape.
Our last example suggests a predicate other than equality and subtyping.
1.3.10 Example [Conditional constraints]: Consider a nonstructural subtyping model. For every type constructor F of image kind κ and for every kind κ′, let (F ⩽ · ⇒ · ≤ ·) be a predicate of signature κ ⊗ κ′ ⊗ κ′ ⇒ ·. Thus, if T₀ has kind κ and T₁, T₂ have kind κ′, then F ⩽ T₀ ⇒ T₁ ≤ T₂ is a well-formed constraint, called a conditional subtyping constraint. Its interpretation is defined as follows: if t₀ ∈ M_κ and t₁, t₂ ∈ M_κ′, then F ⩽ t₀ ⇒ t₁ ≤ t₂ holds if and only if F ⩽ t₀(ϵ) implies t₁ ≤ t₂. In other words, if t₀'s head symbol exceeds F according to the ordering on type constructors, then the subtyping constraint t₁ ≤ t₂ must hold; otherwise, the conditional constraint holds vacuously. Conditional constraints have been studied, for example, in (Reynolds, 1969a; Heintze, 1993; Aiken, Wimmers, and Lakshman, 1994; Pottier, 2000; Su and Aiken, 2001).
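The interpretation of a conditional constraint can be sketched as follows, reusing the pre/abs constructors of the record example. The tree encoding and the covariant-only comparison are simplifying assumptions sufficient for this demonstration:

```python
def leq_constructor(f1, f2):
    # Assumed constructor order: only pre <= abs, plus reflexivity.
    return f1 == f2 or (f1, f2) == ("pre", "abs")

def leq(t1, t2):
    # Covariant-only comparison on the common domain, enough for this demo.
    return all(leq_constructor(t1[p], t2[p]) for p in set(t1) & set(t2))

def conditional(F, t0, t1, t2):
    """F <= t0 => t1 <= t2: vacuous unless F <= t0's head symbol t0(eps)."""
    return leq(t1, t2) if leq_constructor(F, t0[()]) else True

pre_int = {(): "pre", ("content",): "int"}
pre_bool = {(): "pre", ("content",): "bool"}
absent = {(): "abs"}

assert conditional("abs", pre_int, pre_int, pre_bool)  # guard fails: vacuously true
assert not conditional("pre", absent, pre_int, pre_bool)  # guard holds, body fails
assert conditional("pre", absent, pre_int, pre_int)
```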
Many other kinds of constraints exist; see e.g. (Comon, 1993).
Throughout this chapter, we assume (unless stated otherwise) that the set of type constructors, the set of predicates, and the model, which together form the parameter X, are arbitrary and fixed.
As usual, the meaning of a constraint is a function of the meaning of its free type variables, which is given by a ground assignment. The meaning of free program identifiers may be defined as part of the constraint, if desired, using a def prefix, so it need not be given by a separate assignment.
1.3.11 Definition: A ground assignment φ is a total, kind-preserving mapping from V into M. Ground assignments are extended to types by φ(F T₁ … Tₙ) = F(φ(T₁), …, φ(Tₙ)). Then, for every type T of kind κ, φ(T) is a ground type of kind κ. Whether a constraint C holds under a ground assignment φ, written φ ⊢ C (read: φ satisfies C), is defined by the rules in Figure 1-5. A constraint C is satisfiable if and only if φ ⊢ C holds for some φ. It is false if and only if φ ⊢ def Γ in C holds for no ground assignment φ and environment Γ.
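The extension of a ground assignment to types can be sketched in a free finite tree model, with ground types encoded as dicts from paths to constructor names; the nested-tuple encoding of syntactic types is an illustrative assumption:

```python
def interpret(ty, phi):
    """Map a syntactic type to the ground tree it denotes under phi.

    A type is either a variable (a string, looked up in phi) or a tuple
    (constructor, (direction, argument), ...): the free tree model
    grafts each argument's tree under its direction.
    """
    if isinstance(ty, str):            # a type variable
        return phi[ty]
    head, args = ty[0], ty[1:]         # a constructor application
    tree = {(): head}
    for d, arg in args:
        for path, c in interpret(arg, phi).items():
            tree[(d,) + path] = c
    return tree

phi = {"X": {(): "int"}}
ty = ("->", ("domain", "X"), ("codomain", "X"))   # the type X -> X
assert interpret(ty, phi) == {(): "->", ("domain",): "int", ("codomain",): "int"}
```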
Let us now explain the rules that define constraint satisfaction (Figure 1-5). They are syntax-directed: that is, to a given constraint, at most one rule applies. It is determined by the nature of the first construct that appears under a maximal def prefix. CM-True states that a constraint of the form def Γ in true is a tautology, that is, holds under every ground assignment. No rule matches constraints of the form def Γ in false, which means that such constraints do not have a solution. CM-Predicate states that the meaning
Figure 1-5: Meaning of constraints
of a predicate application is given by the predicate's interpretation within the model. More specifically, if P's signature is κ₁ ⊗ … ⊗ κₙ ⇒ ·, then, by well-formedness of the constraint, every Tᵢ is of kind κᵢ, so φ(Tᵢ) is a ground type in M_κᵢ. By Definition 1.3.5, P denotes a predicate on M_κ₁ × … × M_κₙ, so the rule's premise is mathematically well-formed. It is independent of Γ, which is natural, since a predicate application has no free program identifiers. CM-And requires each of the conjuncts to be valid in isolation. The information in Γ is made available to each branch. CM-Exists allows the type variables X̄ to denote arbitrary ground types t̄ within C, independently of their image through φ. We implicitly require X̄ and t̄ to have matching kinds, so that φ[X̄ ↦ t̄] remains a kind-preserving ground assignment. The side condition X̄ # ftv(Γ), which may always be satisfied by suitable α-conversion of the constraint ∃X̄.C, prevents free occurrences of the type variables X̄ within Γ from being unduly affected.
CM-Instance concerns constraints of the form def Γ in x ⪯ T′. The constraint x ⪯ T′ is turned into σ ⪯ T′, where, according to the second premise, σ is Γ(x). Please recall that constraints of this form were introduced in Definition 1.3.3. The environment Γ is replaced with a suitable prefix of itself, namely Γ₁, so that the free program identifiers of σ retain their meaning.
It is intuitively clear that the constraints def x:σ in C and [x ↦ σ]C have the same meaning, where the latter denotes the capture-avoiding substitution of σ for x throughout C. As a matter of fact, it would have been possible to use this equivalence as a definition of the meaning of def constraints, but the present style is pleasant as well. This confirms our (informal) claim that the def form is an explicit substitution form.
It is possible for a constraint to be neither satisfiable nor false. Consider, for instance, the constraint ∃Z.x ⪯ Z. Because the identifier x is free, CM-Instance is not applicable, so the constraint is not satisfiable. Furthermore,
placing it within the context let x: ∀X.X in □ makes it satisfied by every ground assignment, so it is not false. Here, the assertions "C is satisfiable" and "C is false" are opposite when fpi(C) = ∅ holds, whereas in a standard first-order logic, they always are.
In a judgement of the form φ ⊢ C, the ground assignment φ applies to the free type variables of C. This is made precise by the following statements. In the second one, ∘ is composition and θ(C) is the capture-avoiding application of the type substitution θ to C.
1.3.12 Lemma: If X̄ # ftv(C) holds, then φ ⊢ C and φ[X̄ ↦ t̄] ⊢ C are equivalent.
1.3.13 Lemma: φ ∘ θ ⊢ C and φ ⊢ θ(C) are equivalent.

Reasoning with constraints

Because constraints lie at the heart of our treatment of ML-the-type-system, most of our proofs involve establishing logical properties of constraints, that is, entailment or equivalence assertions. Let us first define these notions.
1.3.14 Definition: We write C₁ ⊩ C₂, and say that C₁ entails C₂, if and only if, for every ground assignment φ and for every environment Γ, φ ⊢ def Γ in C₁ implies φ ⊢ def Γ in C₂. We write C₁ ≡ C₂, and say that C₁ and C₂ are equivalent, if and only if C₁ ⊩ C₂ and C₂ ⊩ C₁ hold.
This definition measures the strength of a constraint by the set of pairs (φ, Γ) that satisfy it, and considers a constraint stronger if fewer such pairs satisfy it. In other words, C₁ entails C₂ when C₁ imposes stricter requirements on its free type variables and program identifiers than C₂ does. We remark that C is false if and only if C ≡ false holds. It is straightforward to check that entailment is reflexive and transitive and that ≡ is indeed an equivalence relation.
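In a finite model, this set-containment reading of entailment can be decided by enumerating ground assignments. The sketch below assumes a two-point model {bot, top} with bot ≤ top and ignores program identifiers; representing constraints as Python predicates on assignments is an illustrative encoding:

```python
from itertools import product

GROUND = ["bot", "top"]          # assumed two-point model, bot <= top

def leq(t1, t2):
    return t1 == t2 or t1 == "bot"

def entails(vars_, c1, c2):
    """C1 entails C2 iff no ground assignment satisfies C1 but not C2."""
    for values in product(GROUND, repeat=len(vars_)):
        phi = dict(zip(vars_, values))
        if c1(phi) and not c2(phi):
            return False
    return True

# X <= bot pins X to bot, the least type, so X <= Y follows.
assert entails(["X", "Y"], lambda p: leq(p["X"], "bot"),
               lambda p: leq(p["X"], p["Y"]))
# The converse direction of a subtyping constraint is not entailed.
assert not entails(["X", "Y"], lambda p: leq(p["X"], p["Y"]),
                   lambda p: leq(p["Y"], p["X"]))
```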
We immediately exploit the notion of constraint equivalence to define what it means for a type constructor to be covariant, contravariant, or invariant with respect to one of its parameters. Let F be a type constructor of signature κ₁ ⊗ … ⊗ κₙ ⇒ κ. Let i ∈ {1, …, n}. F is covariant (resp. contravariant, invariant) with respect to its i-th parameter if and only if, for all types T₁, …, Tₙ and Tᵢ′ of appropriate kinds, the constraint F T₁ … Tᵢ … Tₙ ≤ F T₁ … Tᵢ′ … Tₙ is equivalent to Tᵢ ≤ Tᵢ′ (resp. Tᵢ′ ≤ Tᵢ, Tᵢ = Tᵢ′). We let the reader check the following facts: (i) in an equality model, these three notions coincide; (ii) in an equality free tree model, every type constructor is invariant with respect to each of its parameters; and (iii) in a nonstructural subtyping model, if the direction d has been declared covariant (resp. contravariant, invariant), then every type constructor whose arity includes d is covariant (resp. contravariant, invariant) with respect to d. In the following, we require the type constructor → to be contravariant with respect to its domain and covariant with respect to its codomain, a standard requirement in type systems with subtyping (TAPL, Chapter 15). These properties are summed up by the following equivalence law:
T₁ → T₂ ≤ T₁′ → T₂′ ≡ T₁′ ≤ T₁ ∧ T₂ ≤ T₂′   (C-Arrow)
Please note that this is a high-level requirement about the interpretation of types and of the subtyping predicate. In an equality free tree model, for instance, it is always satisfied. In a nonstructural subtyping model, it boils down to requiring that the directions domain and codomain be declared contravariant and covariant, respectively. In the general case, we do not have any knowledge of the model, and cannot formulate a more precise requirement. Thus, it is up to the designer of the model to ensure that C-Arrow holds.
We also exploit the notion of constraint equivalence to define what it means for two type constructors to be incompatible. Two type constructors F₁ and F₂ with the same image kind are incompatible if and only if all constraints of the form F₁ T̄₁ ≤ F₂ T̄₂ and F₂ T̄₂ ≤ F₁ T̄₁ are false; then, we write F₁ ⋈ F₂. Please note that in an equality free tree model, any two distinct type constructors are incompatible. In the following, we often indicate that a newly introduced type constructor must be isolated. We implicitly require that, whenever F₁ and F₂ are distinct and each of them is isolated, F₁ and F₂ be incompatible. Thus, the notion of "isolation" provides a concise and modular way of stating a collection of incompatibility requirements. We consider the type constructor → isolated.
Entailment is preserved by arbitrary constraint contexts, as stated by the following theorem. As a result, constraint equivalence is a congruence. Throughout this chapter, these facts are often used implicitly.
1.3.15 Theorem [Congruence]: C₁ ⊩ C₂ implies C[C₁] ⊩ C[C₂].
We now give a series of lemmas that provide useful entailment laws.
The following is a standard property of existential quantification.
1.3.16 Lemma: C ⊩ ∃X̄.C.
The following lemma states that any supertype of an instance of σ σ sigma\sigmaσ is also an instance of σ σ sigma\sigmaσ.
1.3.17 Lemma: σ ⪯ T ∧ T ≤ T′ ⊩ σ ⪯ T′.
The next lemma gives another interesting simplification law.
1.3.18 Lemma: X ∉ ftv(T) implies ∃X.(X = T) ≡ true.
The following lemma states that, provided D is satisfied, the type T is an instance of the constrained type scheme ∀X̄[D].T.
1.3.19 Lemma: D ⊩ ∀X̄[D].T ⪯ T.
This technical lemma helps justify Definition 1.3.21 below.
1.3.20 Lemma: Let Z ∉ ftv(C, σ, T). Then, C ⊩ σ ⪯ T holds if and only if C ∧ T ≤ Z ⊩ σ ⪯ Z holds.
It is useful to define what it means for a type scheme σ₁ to be more general than a type scheme σ₂. Our informal intent is for σ₁ ⪯ σ₂ to mean: every instance of σ₂ is an instance of σ₁. In Definition 1.3.3, we have introduced the constraint form σ ⪯ T as syntactic sugar. Similarly, one might wish to make σ₁ ⪯ σ₂ a derived constraint form; however, this is impossible, because neither universal quantification nor implication is available in the constraint language. We can, however, exploit the fact that these logical connectives are implicit in entailment assertions by defining a judgement of the form C ⊩ σ₁ ⪯ σ₂, whose meaning is: under the constraint C, σ₁ is more general than σ₂.
1.3.21 Definition: We write C ⊩ σ₁ ⪯ σ₂ if and only if Z ∉ ftv(C, σ₁, σ₂) implies C ∧ σ₂ ⪯ Z ⊩ σ₁ ⪯ Z. We write C ⊩ σ₁ ≡ σ₂ when both C ⊩ σ₁ ⪯ σ₂ and C ⊩ σ₂ ⪯ σ₁ hold.
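This ordering can be read extensionally in a small model: σ₁ is more general than σ₂ when every instance of σ₂ is an instance of σ₁. The sketch below assumes a two-point model {bot, top} with bot ≤ top; representing a scheme as a (variables, constraint, body) triple and its instances as the supertypes of its denoted ground types are illustrative encodings:

```python
from itertools import product

GROUND = ["bot", "top"]          # assumed two-point model, bot <= top

def leq(a, b):
    return a == b or a == "bot"

def instances(scheme):
    """Enumerate the ground instances of forall vars[constraint].body."""
    vars_, constraint, body = scheme
    result = set()
    for values in product(GROUND, repeat=len(vars_)):
        phi = dict(zip(vars_, values))
        if constraint(phi):
            t = body(phi)
            result |= {u for u in GROUND if leq(t, u)}  # supertypes of t
    return result

def more_general(s1, s2):
    return instances(s2) <= instances(s1)

forall_x_x = (["X"], lambda p: True, lambda p: p["X"])   # forall X. X
mono_top = ([], lambda p: True, lambda p: "top")         # the monotype top
assert more_general(forall_x_x, mono_top)
assert not more_general(mono_top, forall_x_x)
```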
This notation is not ambiguous because the assertion C ⊩ σ ⪯ T, whose meaning was initially given by Definitions 1.3.3 and 1.3.14, retains the same meaning under the new definition; this is shown by Lemma 1.3.20 above.
The next lemma provides a way of exploiting the ordering between type schemes introduced by Definition 1.3.21. It states that a type scheme occurs in a contravariant position when it is within a def prefix. In other words, the more general the type scheme, the weaker the entire constraint.
1.3.22 Lemma: C ⊩ σ₁ ⪯ σ₂ implies C ∧ def x:σ₂ in D ⊩ def x:σ₁ in D.
The following exercise generalizes this result to let forms.
1.3.23 Exercise [⋆⋆, ↛]: Prove that Z ∉ ftv(σ) implies ∃σ ≡ ∃Z.σ ⪯ Z. Explain why, as a result, C ⊩ σ₁ ⪯ σ₂ implies C ∧ ∃σ₂ ⊩ ∃σ₁. Use this fact to prove that C ⊩ σ₁ ⪯ σ₂ implies C ∧ let x:σ₂ in D ⊩ let x:σ₁ in D.
The next lemma states that, modulo equivalence, the only constraint that constrains x without explicitly referring to it is false.
1.3.24 Lemma: C ⊩ x ⪯ T and x ∉ fpi(C) imply C ≡ false.
The following lemma states that the more universal quantifiers are present, the more general the type scheme.
1.3.25 Lemma: let x: ∀X̄[C₁].T in C₂ ⊩ let x: ∀X̄Ȳ[C₁].T in C₂.
Conversely, and perhaps surprisingly, it is sometimes possible to remove some type variables from the universal quantifier prefix of a type scheme without compromising its generality. This is the case when the value of these type variables is determined in a unique way. In short, C determines Ȳ if and only if, given the values of ftv(C) \ Ȳ and given that C holds, it is possible to reconstruct, in a unique way, the values of Ȳ.
1.3.26 Definition: C determines Ȳ if and only if, for every environment Γ, two ground assignments that satisfy def Γ in C and that coincide outside Ȳ must coincide on Ȳ as well.
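This definition can be checked by brute force in a small model. The sketch below assumes a two-point model {bot, top} with bot ≤ top and no program identifiers; constraints are encoded as Python predicates on ground assignments:

```python
from itertools import product

GROUND = ["bot", "top"]          # assumed two-point model, bot <= top

def determines(vars_, c, ys):
    """C determines Ys iff satisfying assignments agreeing outside Ys agree on Ys."""
    sols = [dict(zip(vars_, vs)) for vs in product(GROUND, repeat=len(vars_))
            if c(dict(zip(vars_, vs)))]
    others = [v for v in vars_ if v not in ys]
    for p1 in sols:
        for p2 in sols:
            if all(p1[v] == p2[v] for v in others) and \
               any(p1[y] != p2[y] for y in ys):
                return False
    return True

def leq(a, b):
    return a == b or a == "bot"

# X = Y determines Y: once X is fixed, Y is forced.
assert determines(["X", "Y"], lambda p: p["X"] == "Y" or p["X"] == p["Y"], ["Y"])
# X <= Y does not: for X = bot, both choices of Y satisfy the constraint.
assert not determines(["X", "Y"], lambda p: leq(p["X"], p["Y"]), ["Y"])
```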
Two concrete instances of determinacy, one of which is valid only in free tree models, are given by Lemma 1.8.7 on page 82. Determinacy is exploited by the equivalence law C-LetAll in Figure 1-6.
We now give a toolbox of constraint equivalence laws. It is worth noting that they do not form a complete axiomatization of constraint equivalence; in fact, they cannot, since the syntax and meaning of constraints is partly unspecified.
1.3.27 Theorem: All equivalence laws in Figure 1-6 hold.
Let us explain. C-And and C-AndAnd state that conjunction is commutative and associative. C-Dup states that redundant conjuncts may be freely added or removed, where a conjunct is redundant if and only if it is entailed by another conjunct. Throughout this chapter, these three laws are often used implicitly. C-ExEx and C-Ex* allow grouping consecutive existential quantifiers and suppressing redundant ones, where a quantifier is redundant if and only if it does not occur free within its scope. C-ExAnd allows conjunction and existential quantification to commute, provided no capture occurs; it is known as a scope extrusion law. When the rule is oriented from left to right, its side condition may always be satisfied by suitable α-conversion. C-ExTrans states that it is equivalent for a type T′ to be an instance of σ or to be a supertype of some instance of σ. We remark that the instances of a monotype are its supertypes, that is, by Definition 1.3.3, T ⪯ T′ and T ≤ T′ are equivalent. As a result, specializing C-ExTrans to the case where σ is a monotype, we find that T ≤ T′ is equivalent to ∃Z.(T ≤ Z ∧ Z ≤ T′), for fresh Z, a standard equivalence law. When oriented from left to right, it becomes an interesting simplification law: in a chain of subtyping constraints, an intermediate variable such as Z may be suppressed, provided it is local, as witnessed by the existential quantifier ∃Z. C-InId states that, within the scope of the binding x:σ, every free occurrence of x may be safely replaced with σ.
The restriction to free occurrences stems from the side condition x ∉ dpi(C). When the
Figure 1-6: Constraint equivalence laws

C₁ ∧ C₂ ≡ C₂ ∧ C₁   (C-And)
(C₁ ∧ C₂) ∧ C₃ ≡ C₁ ∧ (C₂ ∧ C₃)   (C-AndAnd)
C₁ ∧ C₂ ≡ C₁  if C₁ ⊩ C₂   (C-Dup)
∃X̄.∃Ȳ.C ≡ ∃X̄Ȳ.C   (C-ExEx)
∃X̄.C ≡ C  if X̄ # ftv(C)   (C-Ex*)
(∃X̄.C₁) ∧ C₂ ≡ ∃X̄.(C₁ ∧ C₂)  if X̄ # ftv(C₂)   (C-ExAnd)
∃Z.(σ ⪯ Z ∧ Z ≤ T′) ≡ σ ⪯ T′  if Z ∉ ftv(σ, T′)   (C-ExTrans)
let x:σ in C[x ⪯ T′] ≡ let x:σ in C[σ ⪯ T′]  if x ∉ dpi(C) and dtv(C) # ftv(σ) and {x} ∪ dpi(C) # fpi(σ)   (C-InId)
let Γ in C ≡ ∃Γ ∧ C  if dpi(Γ) # fpi(C)
let Γ; x: ∀X̄[C₁].T in C₂ ≡ let Γ; x: ∀X̄[let Γ in C₁].T in C₂  if X̄ # ftv(Γ) and dpi(Γ) # fpi(Γ)
true ≡ ∃X̄.(X̄ = T̄)  if X̄ # ftv(T̄)
Figure 1-6: Constraint equivalence laws
rule is oriented from left to right, its other side-conditions, which require the context let x : σ in C not to capture σ's free type variables or free program identifiers, may always be satisfied by suitable α-conversion. C-In* complements the previous rule by allowing redundant let bindings to be simplified. We remark that C-InId and C-In* provide a simple procedure for eliminating let forms. C-InAnd states that the let form commutes with conjunction; C-InAnd* spells out a common particular case. C-InEx states that it commutes with existential quantification. When the rule is oriented from left to right, its side-condition may always be satisfied by suitable α-conversion. C-LetLet states that let forms may commute, provided they bind distinct program identifiers and provided no free program identifiers are captured in the process. C-LetAnd allows the conjunct C₁ to be moved outside of the constrained type scheme ∀X̄[C₁ ∧ C₂].T, provided it does not involve any of the universally quantified type variables X̄. When oriented from left to right, the rule yields an important simplification law: indeed, taking an instance of ∀X̄[C₂].T is less expensive than taking an instance of ∀X̄[C₁ ∧ C₂].T, since the latter involves creating a copy of C₁, while the former does not. C-LetDup allows pushing a series of let bindings into a constrained type scheme, provided no capture occurs in the process. It is not used as a simplification law but as a tool in some proofs.
C-LetEx states that it does not make any difference for a set of type variables Ȳ to be existentially quantified inside a constrained type scheme or part of the type scheme's universal quantifiers. Indeed, in either case, taking an instance of the type scheme means producing a constraint where Ȳ is existentially quantified. C-LetAll provides a restricted converse of Lemma 1.3.25. Together, C-LetEx and C-LetAll allow hoisting, in some situations only, existential quantifiers out of the left-hand side of a let form.
1.3.28 Example: C-LetAll would be invalid without the condition that ∃X̄.C₁ determines Ȳ. Consider, for instance, the constraint let x : ∀Y.Y → Y in (x ⪯ int → int ∧ x ⪯ bool → bool) (1), where int and bool are incompatible nullary type constructors. By C-InId and C-In*, it is equivalent to ∃Y.(Y → Y ≤ int → int) ∧ ∃Y.(Y → Y ≤ bool → bool), that is, true. Now, if C-LetAll were valid without its side-condition, then (1) would also be equivalent to ∃Y.let x : Y → Y in (x ⪯ int → int ∧ x ⪯ bool → bool), which by C-InId and C-In* is ∃Y.(Y → Y ≤ int → int ∧ Y → Y ≤ bool → bool). By C-Arrow and C-ExTrans, this is int = bool, that is, false. Thus, the law is invalid in this case. It is easy to see why: when the type scheme σ contains a ∀Y quantifier, every instance of σ receives its own ∃Y quantifier, making Y a distinct (local) type variable; when Y is not universally quantified, however, all instances of σ share references to a single (global) type variable Y. This corresponds to the
intuition that, in the former case, σ is polymorphic in Y, while in the latter case, it is monomorphic in Y. Lemma 1.3.25 states that, when deprived of its side-condition, C-LetAll is only an entailment law, as opposed to an equivalence law. Similarly, it is in general invalid to hoist an existential quantifier out of the left-hand side of a let form. To see this, one may study the (equivalent) constraint let x : ∀X[∃Y.X = Y → Y].X in (x ⪯ int → int ∧ x ⪯ bool → bool).
Naturally, in the above examples, the side-condition "true determines Y" does not hold: by Definition 1.3.26, it is equivalent to "two ground assignments that coincide outside Y must coincide on Y as well", which is false as soon as the model M⋆ contains two distinct elements, such as int and bool here. There are cases, however, where the side-condition does hold. For instance, we later prove that ∃X.(Y = int) determines Y; see Lemma 1.8.7. As a result, C-LetAll states that let x : ∀XY[Y = int].Y → X in C (1) is equivalent to ∃Y.let x : ∀X[Y = int].Y → X in C (2), provided Y ∉ ftv(C). The intuition is simple: because Y is forced to assume the value int by the equation Y = int, it makes no difference whether Y is or isn't universally quantified. We remark that, by C-LetAnd, (2) is equivalent to ∃Y.(Y = int ∧ let x : ∀X.Y → X in C) (3). In an efficient constraint solver, simplifying (1) into (3) before using C-InId to eliminate the let form is worthwhile, since doing so obviates the need to copy the type variable Y and the equation Y = int at every free occurrence of x inside C.
C-LetSub is the analogue of an environment strengthening lemma: roughly speaking, it states that, if a constraint holds under the assumption that x has type X, where X is some supertype of T, then it also holds under the assumption that x has type T. The last three rules deal with the equality predicate. C-Eq states that it is valid to replace equals with equals; note the absence of a side-condition. When oriented from left to right, C-Name allows introducing fresh names X⃗ for the types T⃗. As always, X⃗ stands for a vector of distinct type variables. Of course, this makes sense only if the definition is not circular, that is, if the type variables X̄ do not occur free within the terms T̄. When oriented from right to left, C-Name may be viewed as a simplification law: it allows eliminating type variables whose value has been determined. C-NameEq is a combination of C-Eq and C-Name. It shows that applying an idempotent substitution to a constraint C amounts to placing C within a certain context. This immediately yields a proof of the following fact:
1.3.29 Lemma: C ⊩ D implies θ(C) ⊩ θ(D).
It is important to stress that, because the effect of a type substitution may be emulated using equations, conjunction, and existential quantification, there is no need ever to employ type substitutions in the definition of a constraint-based type system: it is possible, instead, to express every concept in terms
of constraints. In this chapter, we follow this route, and use type substitutions only when dealing with the type system DM, whose historical formulation is substitution-based.
So far, we have considered def a primitive constraint form and defined the let form in terms of def, conjunction, and existential quantification. The motivation for this approach was to simplify the proof of several constraint equivalence laws. However, in the remainder of this chapter, we work with let forms exclusively and never employ the def construct. As a result, it is possible, from here on, to discard def and pretend that let is primitive. This change in perspective offers us a few extra properties, stated in the next two lemmas. First, every constraint that contains a false subconstraint must be false. Second, no satisfiable constraint has a free program identifier.
1.3.30 Lemma: C[false] ≡ false.
1.3.31 Lemma: If C is satisfiable, then fpi(C) = ∅.

Reasoning with constraints in an equality-only syntactic model

We have given a number of equivalence laws that are valid with respect to any interpretation of constraints, that is, within any model. However, an important special case is that of equality-only syntactic models. Indeed, in that specific setting, our constraint-based type systems are in close correspondence with DM. In short, we aim to prove that every satisfiable constraint admits a canonical solved form, to show that this notion corresponds to the standard concept of a most general unifier, and to establish a few technical properties of most general unifiers.
Thus, let us now assume that constraints are interpreted in an equality-only syntactic model. Let us further assume that, for every kind κ, (i) there are at least two type constructors of image kind κ and (ii) for every type constructor F of image kind κ, there exists t ∈ M_κ such that t(ϵ) = F. We refer to models that violate (i) or (ii) as degenerate; one may argue that such models are of little interest. The assumption that the model is nondegenerate is used in the proofs of Lemmas 1.3.32 and 1.3.39.
Under these new assumptions, the interpretation of equality coincides with its syntax: every equation that holds in the model is in fact a syntactic truism. The converse, of course, holds in every model.
1.3.32 Lemma: If true ⊩ T = T′ holds, then T and T′ coincide.
In a syntactic model, ground types are finite trees. As a result, cyclic equations, such as X = int → X, are false.
1.3.33 Lemma: X ∈ ftv(T) and T ∉ V imply (X = T) ≡ false.
A solved form is a conjunction of equations, where the left-hand sides are distinct type variables that do not appear in the right-hand sides, possibly surrounded by a number of existential quantifiers. Our definition is identical to Lassez, Maher, and Marriott's solved forms (1988) and to Jouannaud and Kirchner's tree solved forms (1991), except we allow for prenex existential quantifiers, which are made necessary by our richer constraint language. Jouannaud and Kirchner also define dag solved forms, which may be exponentially smaller. Because we define solved forms only for proof purposes, we need not take performance into account at this point. The efficient constraint solver presented in Section 1.8 does manipulate graphs, rather than trees. Type scheme introduction and instantiation constructs cannot appear within solved forms; indeed, provided the constraint at hand has no free program identifiers, they can be expanded away. For this reason, their presence in the constraint language has no impact on the results contained in this section.
1.3.34 Definition: A solved form is of the form ∃Ȳ.(X⃗ = T⃗), where X̄ # ftv(T̄).
Solved forms offer a convenient way of reasoning about constraints because every satisfiable constraint is equivalent to one. In other words, every constraint is equivalent to either a solved form or false. This property is established by the following lemma, whose proof provides a simple but effective procedure to rewrite a constraint to either a solved form or false.
1.3.35 Lemma: Let fpi(C) = ∅. Then, C is equivalent to either a solved form or false.
Proof: We first establish that every conjunction of equations is equivalent to either a solved form or false. To do so, we present Robinson's unification algorithm (1971) as a rewriting system. The system's invariant is to operate on constraints of the form either X⃗ = T⃗; C, where X̄ # ftv(T̄, C) and the semicolon is interpreted as a distinguished conjunction, or false. We identify equations in C up to commutativity. The system is defined as follows:
X⃗ = T⃗ ; X = X ∧ C → X⃗ = T⃗ ; C

X⃗ = T⃗ ; F T⃗₁ = F T⃗₂ ∧ C → X⃗ = T⃗ ; T⃗₁ = T⃗₂ ∧ C

X⃗ = T⃗ ; F₁ T⃗₁ = F₂ T⃗₂ ∧ C → false   if F₁ ≠ F₂

X⃗ = T⃗ ; X = T ∧ C → X⃗ = [X ↦ T]T⃗ ∧ X = T ; [X ↦ T]C   if X ∉ ftv(T)

X⃗ = T⃗ ; X = T ∧ C → false   if X ∈ ftv(T) and T ∉ V
It is straightforward to check that the above invariant is indeed preserved by the rewriting system. Let us check that constraint equivalence is also preserved. For the first rule, this is immediate. For the second and third rules, it
follows from the fact that we have assumed a free tree model; for the fourth rule, it follows from C-Eq; for the last rule, from Lemma 1.3.33. Furthermore, the system is terminating; this is witnessed by an ordering where false is the least element and where constraints of the form X⃗ = T⃗; C are ordered lexicographically, first by the number of type variables that appear free within C, second by the size of C. Last, a normal form for this rewriting system must be of the form either X⃗ = T⃗; true, where (by the invariant) X̄ # ftv(T̄), that is, a solved form, or false.
Next, we show that the present lemma holds when C is built out of equations, conjunction, and existential quantification. Orienting C-ExAnd from left to right yields a terminating rewriting system that preserves constraint equivalence. The normal form of C must be ∃Ȳ.C′, where C′ is a conjunction of equations. By the previous result, C′ is equivalent to either a solved form or false. Because solved forms are preserved by existential quantification and because ∃Ȳ.false is false, the same holds of C.
Last, we establish the result in the general case. We assume fpi(C) = ∅ (1). Orienting C-InId and C-In* from left to right yields a terminating rewriting system that preserves constraint equivalence. The normal form C′ of C cannot contain any type scheme introduction forms; given (1), it cannot contain any instantiation forms either. Thus, C′ is built out of equations, conjunction, and existential quantification only. By the previous result, it is equivalent to either a solved form or false, which implies that the same holds of C.
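The first stage of this proof, Robinson's algorithm presented as a rewriting system, can be sketched concretely. The sketch below uses our own representation, not the chapter's: terms over a free algebra are `Var` (a type variable) or `Con` (a type constructor applied to arguments), and `unify` maintains the solved prefix X⃗ = T⃗ of the invariant as a dictionary, raising `Clash` when the constraint rewrites to false.

```python
# A minimal sketch of the unification rewriting system above.
# The names Var, Con, Clash, unify are ours, not the chapter's.
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str                    # a type variable X

@dataclass(frozen=True)
class Con:
    head: str                    # a type constructor F
    args: tuple = ()             # its arguments T⃗

class Clash(Exception):
    """The constraint rewrites to false."""

def occurs(x, t):
    """The occurs check behind Lemma 1.3.33: does x occur in t?"""
    if isinstance(t, Var):
        return t.name == x
    return any(occurs(x, u) for u in t.args)

def subst(x, t, u):
    """Apply the one-variable substitution [X ↦ T] to the term u."""
    if isinstance(u, Var):
        return t if u.name == x else u
    return Con(u.head, tuple(subst(x, t, a) for a in u.args))

def unify(eqs):
    """Rewrite a conjunction of equations to a solved form, returned
    as a dict from variable names to terms, or raise Clash (false)."""
    solved = {}                  # the prefix X⃗ = T⃗ of the invariant
    eqs = list(eqs)
    while eqs:
        t1, t2 = eqs.pop()
        if isinstance(t1, Con) and isinstance(t2, Con):
            if t1.head != t2.head or len(t1.args) != len(t2.args):
                raise Clash()    # clash of distinct constructors
            eqs.extend(zip(t1.args, t2.args))  # decompose F T⃗₁ = F T⃗₂
            continue
        if not isinstance(t1, Var):  # orient: variable on the left
            t1, t2 = t2, t1
        if isinstance(t2, Var) and t1.name == t2.name:
            continue             # drop the trivial equation X = X
        if occurs(t1.name, t2):
            raise Clash()        # cyclic equation, false by Lemma 1.3.33
        # substitute [X ↦ T] into the solved prefix and pending equations
        solved = {y: subst(t1.name, t2, u) for y, u in solved.items()}
        solved[t1.name] = t2
        eqs = [(subst(t1.name, t2, a), subst(t1.name, t2, b))
               for a, b in eqs]
    return solved
```

For instance, unifying X → Y with int → X yields the solved form X = int ∧ Y = int, while X = X → X fails the occurs check.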
It is possible to impose further restrictions on solved forms. A solved form ∃Ȳ.(X⃗ = T⃗) is canonical if and only if its free type variables are exactly X̄. This is stated, in an equivalent way, by the following definition.
1.3.36 Definition: A canonical solved form is a constraint of the form ∃Ȳ.(X⃗ = T⃗), where ftv(T̄) ⊆ Ȳ and X̄ # Ȳ.
1.3.37 Lemma: Every solved form is equivalent to a canonical solved form.
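One way the construction behind Lemma 1.3.37 can be sketched, under our own encoding (a term is either a variable name or a (constructor, args) pair; the names `ftv`, `rename`, `canonicalize` and the fresh variables Y0, Y1, … are ours and assumed not to clash with existing variables): every variable still free in a right-hand side is renamed to a fresh existentially bound variable, which receives an equation of its own.

```python
# A hypothetical sketch of canonicalization of a solved form.
import itertools

def ftv(t):
    """Free type variables of a term."""
    if isinstance(t, str):
        return {t}
    _head, args = t
    out = set()
    for a in args:
        out |= ftv(a)
    return out

def rename(t, r):
    """Rename the variables of t according to the dict r."""
    if isinstance(t, str):
        return r.get(t, t)
    head, args = t
    return (head, tuple(rename(a, r) for a in args))

def canonicalize(ys, bindings):
    """Given a solved form ∃ys.(X⃗ = T⃗) with X̄ # ftv(T̄), produce an
    equivalent canonical solved form: after the transformation, the
    right-hand sides mention only existentially bound variables."""
    fresh = (f"Y{i}" for i in itertools.count())  # assumed fresh
    free = set()
    for t in bindings.values():
        free |= ftv(t)
    free -= set(ys)
    r = {z: next(fresh) for z in sorted(free)}
    new_ys = set(ys) | set(r.values())
    new_bindings = {x: rename(t, r) for x, t in bindings.items()}
    for z, y in r.items():
        new_bindings[z] = y      # the equation Z = Y for old free Z
    return new_ys, new_bindings
```

On the solved form X = Z → Z, where Z is free, this produces ∃Y0.(X = Y0 → Y0 ∧ Z = Y0), whose free type variables are exactly the left-hand sides X and Z, as Definition 1.3.36 requires.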
It is easy to describe the solutions of a canonical solved form: they are the ground refinements of the substitution [X⃗ ↦ T⃗].
1.3.38 Lemma: A ground assignment φ satisfies a canonical solved form ∃Ȳ.(X⃗ = T⃗) if and only if there exists a ground assignment φ′ such that φ(X⃗) = φ′(T⃗). As a result, every canonical solved form is satisfiable.
Proof: Let ∃Ȳ.(X⃗ = T⃗) be a canonical solved form. By CM-Exists and CM-Predicate, φ satisfies ∃Ȳ.(X⃗ = T⃗) if and only if there exists t⃗ such that φ[Y⃗ ↦ t⃗](X⃗) = φ[Y⃗ ↦ t⃗](T⃗). Thanks to the hypotheses X̄ # Ȳ and ftv(T̄) ⊆ Ȳ, this is equivalent to the existence of a ground assignment φ′ such that φ(X⃗) = φ′(T⃗).
Thus, for every ground assignment φ′, the assignment φ′[X⃗ ↦ φ′(T⃗)] satisfies ∃Ȳ.(X⃗ = T⃗), which proves that this constraint is satisfiable.
Together, Lemmas 1.3.37 and 1.3.38 imply that every solved form is satisfiable. Our interest in canonical solved forms stems from the following fundamental property, which provides a syntactic characterization of entailment between canonical solved forms: if ∃Ȳ₁.(X⃗ = T⃗₁) is more specific than ∃Ȳ₂.(X⃗ = T⃗₂), in a logical sense, then T⃗₁ refines T⃗₂, in a syntactic sense. The converse also holds (can you prove it?), but is not needed here.
1.3.39 Lemma: If ∃Ȳ₁.(X⃗ = T⃗₁) ⊩ ∃Ȳ₂.(X⃗ = T⃗₂), where both sides are canonical solved forms, then there exists a type substitution φ such that T⃗₁ = φ(T⃗₂).
As a corollary, we find that canonical solved forms are unique up to α-conversion and up to C-Ex*, provided the set X̄ of their free type variables is fixed.
1.3.40 Lemma: If the canonical solved forms ∃Ȳ₁.(X⃗ = T⃗₁) and ∃Ȳ₂.(X⃗ = T⃗₂) are equivalent, then there exists a renaming ρ such that T⃗₁ = ρ(T⃗₂).
Note that the equivalence of the canonical solved forms ∃Ȳ₁.(X⃗₁ = T⃗₁) and ∃Ȳ₂.(X⃗₂ = T⃗₂) does not imply that X̄₁ and X̄₂ coincide. Consider, for example, the canonical solved forms true and ∃Y.(X = Y), which by C-NameEq are equivalent. One might wish to further restrict canonical solved forms by requiring X̄ to be the set of essential type variables of the constraint ∃Ȳ.(X⃗ = T⃗), that is, the set of type variables that appear free in all equivalent constraints. However, as far as our technical development is concerned, it seems more convenient not to do so. Instead, we show that it is possible to explicitly restrict or extend X̄ when needed (Lemma 1.3.43).
The following definition allows entertaining a dual view of canonical solved forms, either as constraints or as idempotent type substitutions. The latter view is commonly found in standard treatments of unification (Lassez, Maher, and Marriott, 1988; Jouannaud and Kirchner, 1991) and in classic presentations of ML-the-type-system.
1.3.41 Definition: If [X⃗ ↦ T⃗] is an idempotent substitution of domain X̄, let ∃[X⃗ ↦ T⃗] denote the canonical solved form ∃Ȳ.(X⃗ = T⃗), where Ȳ = ftv(T̄). An idempotent substitution θ is a most general unifier of the constraint C if and only if ∃θ and C are equivalent.
By definition, equivalent constraints admit the same most general unifiers. Many properties of canonical solved forms may be reformulated in terms of most general unifiers. By Lemmas 1.3.31, 1.3.35, and 1.3.37, every satisfiable constraint admits a most general unifier. By Lemma 1.3.40, if [X⃗ ↦ T⃗₁] and [X⃗ ↦ T⃗₂] are most general unifiers of C, then T⃗₁ and T⃗₂ coincide up to a renaming. Conversely, if [X⃗ ↦ T⃗] is a most general unifier of C and if X̄ # ρ holds, then [X⃗ ↦ ρ(T⃗)] is also a most general unifier of C; indeed, these two substitutions correspond to α-equivalent canonical solved forms.
The following result relates the substitution θ to the canonical solved form ∃θ, stating that every ground refinement of the former satisfies the latter.
1.3.42 Lemma: θ(∃θ) ≡ true.
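This lemma can be checked concretely on small examples, under our own encoding (terms as variable names or (constructor, args) pairs; the names `apply_subst`, `is_idempotent`, `self_application_is_true` are ours): applying an idempotent θ to each of its own defining equations X = θ(X) yields a syntactic identity T = T, so by Lemma 1.3.32 the resulting conjunction is equivalent to true.

```python
# A small illustration of Lemma 1.3.42 for idempotent substitutions.

def apply_subst(theta, t):
    """Apply the substitution theta (a dict) to the term t."""
    if isinstance(t, str):
        return theta.get(t, t)   # a variable: look it up once
    head, args = t
    return (head, tuple(apply_subst(theta, a) for a in args))

def is_idempotent(theta):
    """theta is idempotent iff applying it twice equals applying it
    once, i.e. dom(theta) is disjoint from ftv of its codomain."""
    return all(apply_subst(theta, t) == t for t in theta.values())

def self_application_is_true(theta):
    """theta(∃theta): apply theta to each defining equation X = theta(X)
    and check that every equation becomes a syntactic identity."""
    return all(apply_subst(theta, x) == apply_subst(theta, t)
               for x, t in theta.items())
```

For θ = [X ↦ Y → Y, Z ↦ int], both checks succeed; for the non-idempotent [X ↦ X → Y], the occurrence of X in its own image breaks idempotency, and the check of the lemma's conclusion fails as well.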
The following lemma offers two technical results: the domain of a most general unifier of C may be restricted so as to become a subset of ftv(C); it may also be extended to include arbitrary fresh variables. The next lemma is a simple corollary.
1.3.43 Lemma: Let θ be a most general unifier of C. If Z̄ # ftv(C), then θ \ Z̄ is also a most general unifier of C. If Z̄ # θ, then there exists a most general unifier of C that extends θ and whose domain is dom(θ) ∪ Z̄.
1.3.44 Lemma: Let $\theta_1$ and $\theta_2$ be most general unifiers of $C$. Let $\bar{X} = \mathit{dom}(\theta_1) \cap \mathit{dom}(\theta_2)$. Then, $\theta_1(\bar{X})$ and $\theta_2(\bar{X})$ coincide up to a renaming.
Our last technical result relates the most general unifiers of $C$ with the most general unifiers of $\exists\bar{X}.C$. It states that the former are extensions of the latter. Furthermore, under a few freshness conditions, every most general unifier of $\exists\bar{X}.C$ may be extended to yield a most general unifier of $C$.
1.3.45 Lemma: If $\theta$ is a most general unifier of $C$, then $\theta \setminus \bar{X}$ is a most general unifier of $\exists\bar{X}.C$. Conversely, if $\theta$ is a most general unifier of $\exists\bar{X}.C$ and $\bar{X} \# \theta$ and $\mathit{ftv}(\exists\bar{X}.C) \subseteq \mathit{dom}(\theta)$ hold, then there exists a type substitution $\theta'$ such that $\theta'$ extends $\theta$, $\theta'$ is a most general unifier of $C$, and $\mathit{dom}(\theta') = \mathit{dom}(\theta) \cup \bar{X}$.
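Viewing substitutions as finite maps, the two operations used in Lemmas 1.3.43 and 1.3.45, restriction $\theta \setminus \bar{X}$ and extension with fresh variables, are ordinary dictionary operations. A minimal sketch, with a representation of our own:

```python
def restrict_away(theta, xs):
    """theta \\ xs: forget the bindings of the variables in xs."""
    return {x: t for x, t in theta.items() if x not in xs}

def extend(theta, extra):
    """Extend theta with bindings for variables outside its domain."""
    assert not set(theta) & set(extra), "domains must be disjoint"
    return {**theta, **extra}
```

Restriction after extension by fresh variables recovers the original substitution, which is why the two lemmas compose smoothly.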

1.4 HM(X)

Constraint-based type systems appeared during the 1980s (Mitchell, 1984; Fuh and Mishra, 1988) and were widely studied during the following decade (Curtis, 1990; Aiken and Wimmers, 1993; Jones, 1994a; Smith, 1994; Palsberg, 1995; Trifonov and Smith, 1996; Fähndrich, 1999; Pottier, 2001b). We now present one such system, baptized HM(X) because it is a parameterized extension of Hindley and Milner's type discipline; the meaning of the parameter $X$ was explained on page 24. Its original description is due to Odersky, Sulzmann, and Wehr (1999a). Since then, it has been completed in a number of works (Sulzmann, Müller, and Zenger, 1999; Sulzmann, 2000; Pottier, 2001a; Skalka and Pottier, 2002). Each of these presentations introduces minor variations. Here, we follow (Pottier, 2001a), which is itself inspired by (Sulzmann, Müller, and Zenger, 1999).

Definition

Our presentation of HM(X) relies on the constraint language introduced in Section 1.3. Technically, our approach to constraints is more direct than that of (Odersky, Sulzmann, and Wehr, 1999a). We interpret constraints within a model, give conjunction and existential quantification their standard meaning, and derive a number of equivalence laws (Section 1.3). Odersky et al., on the other hand, do not explicitly rely on a logical interpretation; instead, they axiomatize constraint equivalence, that is, they consider a number of equivalence laws as axioms. Thus, they ensure that their high-level proofs, such as type soundness and correctness and completeness of type inference, are independent of the low-level details of the logical interpretation of constraints. Their approach is also more general, since it allows dealing with other logical interpretations, such as "open-world" interpretations, where constraints are interpreted not within a fixed model, but within a family of extensions of a "current" model. In this chapter, we have avoided this extra layer of abstraction, for the sake of definiteness; however, the changes required to adopt Odersky et al.'s approach would not be extensive, since the forthcoming proofs do indeed rely mostly on constraint equivalence laws, rather than on low-level details of the logical interpretation of constraints.
Another slight departure from Odersky et al.'s work lies in the fact that we have enriched the constraint language with type scheme introduction and instantiation forms, which were absent in the original presentation of HM(X). To prevent this addition from affecting HM(X), we require the constraints that appear in HM(X) typing judgements to have no free program identifiers. Please note that this does not prevent them from containing let forms; we shall in fact exploit this feature when establishing an equivalence between HM(X) and the type system presented in Section 1.5, where the new constraint forms are effectively used.
The type system HM(X) consists of a four-place judgement whose parameters are a constraint $C$, an environment $\Gamma$, an expression $t$, and a type scheme $\sigma$. A judgement is written $C, \Gamma \vdash t : \sigma$ and is read: under the assumptions $C$ and $\Gamma$, the expression $t$ has type $\sigma$. One may view $C$ as an assumption about the judgement's free type variables and $\Gamma$ as an assumption about $t$'s free program identifiers. Please recall that $\Gamma$ now contains constrained type schemes, and that $\sigma$ is a constrained type scheme.
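As a purely hypothetical illustration, constrained type schemes $\forall\bar{X}[D].T$ and four-place judgements can be pictured as plain records. All names below are ours, not part of HM(X):

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Scheme:
    """A constrained type scheme  forall Xbar [D] . T  (toy encoding)."""
    quantified: Tuple[str, ...]   # the universally quantified variables
    constraint: object            # the constraint D bearing on them
    body: object                  # the type T

def mono(ty):
    """A plain type T, identified with the scheme forall {} [true] . T."""
    return Scheme((), True, ty)

@dataclass(frozen=True)
class Judgement:
    """A four-place judgement  C, Gamma |- t : sigma  (toy encoding)."""
    constraint: object            # C: assumption on free type variables
    env: Tuple                    # Gamma: identifiers paired with schemes
    term: object                  # t
    scheme: Scheme                # sigma
```

The helper `mono` mirrors the identification of a type $T$ with the scheme $\forall\varnothing[\mathit{true}].T$ used below.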
Figure 1-7: Typing rules for HM(X)

We would like the validity of a typing judgement to depend not on the syntax, but only on the meaning of its constraint assumption. We enforce this point of view by considering judgements equal modulo equivalence of their constraint assumptions. In other words, the typing judgements $C, \Gamma \vdash t : \sigma$ and $D, \Gamma \vdash t : \sigma$ are considered identical when $C \equiv D$ holds. As a result, it does not make sense to analyze the syntax of a judgement's constraint assumption. A judgement is valid, or holds, if and only if it is derivable via the rules given in Figure 1-7. Please note that a valid judgement may involve an unsatisfiable constraint. A program $t$ is well-typed within the environment $\Gamma$ if and only if a judgement of the form $C, \Gamma \vdash t : \sigma$ holds for some satisfiable constraint $C$.
Let us now explain the rules. Like DM-VAR, HMX-VAR looks up the environment to determine the type scheme associated with the program identifier $x$. The constraint $C$ that appears in the conclusion must be strong enough to guarantee that $\sigma$ has an instance; this is expressed by the second premise. This technical requirement is used in the proof of Lemma 1.4.1. HMX-ABS, HMX-APP, and HMX-LET are identical to DM-ABS, DM-APP, and DM-LET, respectively, except that the assumption $C$ is made available to every subderivation. We recall that the type $T$ may be viewed as the type scheme $\forall\varnothing[\mathit{true}].T$ (Definitions 1.2.18 and 1.3.2). As a result, types form a subset of type schemes, which implies that $\Gamma; z : T$ is a well-formed environment and $C, \Gamma \vdash t : T$ a well-formed typing judgement. To understand HMX-GEN, it is best to first consider the particular case where $C$ is true. This yields the following, simpler rule:
$$\frac{D,\ \Gamma \vdash t : T \qquad \bar{X} \# \mathit{ftv}(\Gamma)}{\exists\bar{X}.D,\ \Gamma \vdash t : \forall\bar{X}[D].T} \tag{HMX-GEN'}$$
The second premise is identical to that of DM-GEN: the type variables that are generalized must not occur free within the environment. The conclusion forms the type scheme $\forall\bar{X}[D].T$, where the type variables $\bar{X}$ have become universally quantified, but are still subject to the constraint $D$. Please note that the type variables that occur free in $D$ may include not only $\bar{X}$, but also other type variables, typically free in $\Gamma$. The rule's conclusion carries the constraint $\exists\bar{X}.D$, thus recording the requirement that the newly formed type scheme should have an instance; again, this is used in the proof of Lemma 1.4.1. HMX-GEN may be viewed as a more liberal version of HMX-GEN', whereby part of the current constraint, namely $C$, need not be copied if it does not concern the type variables that are being generalized, namely $\bar{X}$. This optimization is important in practice, because $C$ may be very large. An intuitive explanation for its correctness is given by the constraint equivalence law C-LETAND, which expresses the same optimization in terms of let constraints. Because HM(X) does not use let constraints, the optimization is hard-wired into the typing rule. HMX-INST allows taking an instance of a type scheme. The reader may be surprised to find that, contrary to DM-INST, it does not involve a type substitution. Instead, the rule merely drops the universal quantifier, which amounts to applying the identity substitution $[\vec{X} \mapsto \vec{X}]$.
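The generalization step of HMX-GEN', quantifying exactly the variables that do not occur free in $\Gamma$, can be sketched as follows; the representation (variables as strings, constructors as tuples) is our own toy encoding, not the chapter's.

```python
def ftv(t):
    """Free type variables of a toy type (variables are strings)."""
    if isinstance(t, str):
        return {t}
    return set().union(set(), *(ftv(a) for a in t[1:]))

def generalize(env_types, constraint_vars, ty):
    """Quantify the variables of ty and of the constraint that are not
    free in the environment, as in HMX-GEN'."""
    env_ftv = set().union(set(), *(ftv(t) for t in env_types))
    quantified = (ftv(ty) | set(constraint_vars)) - env_ftv
    return sorted(quantified), ty
```

Note how a variable shared with the environment, such as `"A"` below, escapes quantification, just as the second premise demands.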
One should recall, however, that type schemes are considered equal modulo $\alpha$-conversion, so it is possible to rename the type scheme's universal quantifiers prior to using HMX-INST. The reason why this provides sufficient expressive power appears in the proof of Theorem 1.4.7 below. The constraint $D$ carried by the type scheme is recorded as part of the current constraint in HMX-INST's conclusion. The subsumption rule HMX-SUB allows a type $T$ to be replaced at any time with an arbitrary supertype $T'$. Because both $T$ and $T'$ may have free type variables, whether $T \leq T'$ holds depends on the current assumption $C$, which is why the rule's second premise is an entailment assertion. An operational explanation of HMX-SUB is that it requires all uses of subsumption to be explicitly recorded in the current constraint. Please note that HMX-SUB remains a useful and necessary rule even when subtyping is interpreted as equality: then, it allows exploiting the type equations found in $C$. Last, HMX-EXISTS allows the type variables that occur only within the current constraint to become existentially quantified. As a result, these type variables no longer occur free in the rule's conclusion; in other words, they have become local to the subderivation rooted at the premise. One may prove that the presence of HMX-EXISTS in the type system does not augment the set of well-typed programs, but does augment the set of valid typing judgements; it is a pleasant technical convenience. Indeed, because judgements are considered equal modulo constraint equivalence, constraints may be transparently simplified at any time. (By simplifying a constraint, we mean replacing it with an equivalent constraint whose syntactic representation is considered simpler.) Bearing this fact in mind, one finds that an effect of rule HMX-EXISTS is to enable more simplifications: because constraint equivalence is a congruence, $C \equiv D$ implies $\exists\bar{X}.C \equiv \exists\bar{X}.D$, but the converse does not hold in general. For instance, there is in general no way of simplifying the judgement $X \leq Y \leq Z, \Gamma \vdash t : \sigma$, but if it is known that $Y$ does not appear free in $\Gamma$ or $\sigma$, then HMX-EXISTS allows deriving $\exists Y.(X \leq Y \leq Z), \Gamma \vdash t : \sigma$, which is the same judgement as $X \leq Z, \Gamma \vdash t : \sigma$. Thus, an interesting simplification has been enabled. Please note that $X \leq Y \leq Z \equiv X \leq Z$ does not hold, while, according to C-EXTRANS, $\exists Y.(X \leq Y \leq Z) \equiv X \leq Z$ does.
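The elimination of an intermediate variable licensed by C-EXTRANS can be mimicked on a toy representation of equality constraints between variables. The helper below is our own sketch; it assumes the eliminated variable occurs in at least one equation.

```python
def eliminate(y, eqs):
    """Drop variable y from a list of equations between variables by
    substituting one of its equations for it (assumes y occurs)."""
    defn = next(t if s == y else s for (s, t) in eqs if y in (s, t))
    subst = lambda u: defn if u == y else u
    return [(subst(s), subst(t)) for (s, t) in eqs
            if (subst(s), subst(t)) != (defn, defn)]
```

For instance, eliminating `Y` from `X = Y` and `Y = Z` leaves the single equation `X = Z`, which is the equality-model analogue of simplifying $\exists Y.(X \leq Y \leq Z)$ to $X \leq Z$.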
We now establish a few simple properties of the type system HM(X). Our first lemma is a minor technical property.
1.4.1 Lemma: $C, \Gamma \vdash t : \sigma$ implies $C \Vdash \exists\sigma$.
The next lemma states that strengthening a judgement's constraint assumption preserves its validity. In other words, weakening a judgement preserves its validity. It is worth noting that in traditional presentations, which rely more heavily on type substitutions, the analogue of this result is a type substitution lemma; see for instance (Tofte, 1988, Lemma 2.7), (Leroy, 1992, Proposition 1.2), (Skalka and Pottier, 2002, Lemma 3.4). Here, the lemma further states that weakening a judgement does not alter the shape of its derivation, a useful property when reasoning by induction on type derivations.
1.4.2 Lemma [Weakening]: If $C' \Vdash C$, then every derivation of $C, \Gamma \vdash t : \sigma$ may be turned into a derivation of $C', \Gamma \vdash t : \sigma$ with the same shape.
Proof: The proof is by structural induction on a derivation of $C, \Gamma \vdash t : \sigma$. In each proof case, we adopt the notations of Figure 1-7.
  • Case HMX-VAR. The rule's conclusion is $C, \Gamma \vdash x : \sigma$. Its premises are $\Gamma(x) = \sigma$ (1) and $C \Vdash \exists\sigma$ (2). By hypothesis, we have $C' \Vdash C$ (3). By transitivity of entailment, (3) and (2) imply $C' \Vdash \exists\sigma$ (4). By HMX-VAR, (1) and (4) yield $C', \Gamma \vdash x : \sigma$.
  • Cases HMX-ABS, HMX-APP, HMX-LET. By the induction hypothesis and by HMX-ABS, HMX-APP, or HMX-LET, respectively.
  • Case HMX-GEN. The rule's conclusion is $C \wedge \exists\bar{X}.D, \Gamma \vdash t : \forall\bar{X}[D].T$. Its premises are $C \wedge D, \Gamma \vdash t : T$ (1) and $\bar{X} \# \mathit{ftv}(C, \Gamma)$ (2). By hypothesis, we have $C' \Vdash C \wedge \exists\bar{X}.D$ (3). We may assume, w.l.o.g., $\bar{X} \# \mathit{ftv}(C')$ (4). Applying the induction hypothesis to (1) and to the entailment assertion $C' \wedge C \wedge D \Vdash C \wedge D$, we obtain $C' \wedge C \wedge D, \Gamma \vdash t : T$ (5). By HMX-GEN, applied to (5), (2) and (4), we get $C' \wedge C \wedge \exists\bar{X}.D, \Gamma \vdash t : \forall\bar{X}[D].T$ (6). By (3) and C-DUP, the constraints $C' \wedge C \wedge \exists\bar{X}.D$ and $C'$ are equivalent, so (6) is the goal $C', \Gamma \vdash t : \forall\bar{X}[D].T$.
  • Case HMX-INST. The rule's conclusion is $C \wedge D, \Gamma \vdash t : T$. Its premise is $C, \Gamma \vdash t : \forall\bar{X}[D].T$ (1). By hypothesis, $C'$ entails $C \wedge D$ (2). Because (2) implies $C' \Vdash C$, the induction hypothesis may be applied to (1), yielding $C', \Gamma \vdash t : \forall\bar{X}[D].T$ (3). By HMX-INST, we obtain $C' \wedge D, \Gamma \vdash t : T$ (4). Because (2) implies $C' \equiv C' \wedge D$, (4) is the goal $C', \Gamma \vdash t : T$.
  • Case HMX-SUB. The rule's conclusion is $C, \Gamma \vdash t : T'$. Its premises are $C, \Gamma \vdash t : T$ (1) and $C \Vdash T \leq T'$ (2). By hypothesis, we have $C' \Vdash C$ (3). Applying the induction hypothesis to (1) and (3) yields $C', \Gamma \vdash t : T$ (4). By transitivity of entailment, (3) and (2) imply $C' \Vdash T \leq T'$ (5). By HMX-SUB, (4) and (5) yield $C', \Gamma \vdash t : T'$.
  • Case HMX-EXISTS. The rule's conclusion is $\exists\bar{X}.C, \Gamma \vdash t : \sigma$. Its premises are $C, \Gamma \vdash t : \sigma$ (1) and $\bar{X} \# \mathit{ftv}(\Gamma, \sigma)$ (2). By hypothesis, we have $C' \Vdash \exists\bar{X}.C$ (3). We may assume, w.l.o.g., $\bar{X} \# \mathit{ftv}(C')$ (4). Applying the induction hypothesis to (1) and to the entailment assertion $C' \wedge C \Vdash C$, we obtain $C' \wedge C, \Gamma \vdash t : \sigma$ (5). By HMX-EXISTS, (5) and (2) yield $\exists\bar{X}.(C' \wedge C), \Gamma \vdash t : \sigma$ (6). By (4) and C-EXAND, the constraint $\exists\bar{X}.(C' \wedge C)$ is equivalent to $C' \wedge \exists\bar{X}.C$, which, by (3) and C-DUP, is equivalent to $C'$. Thus, (6) is the goal $C', \Gamma \vdash t : \sigma$.
We do not give a direct type soundness proof for HM(X). Instead, in Section 1.5, we prove that it is equivalent to another type system, which later is itself proven sound. A direct type soundness result, based on a denotational semantics, may be found in (Odersky, Sulzmann, and Wehr, 1999a). Another type soundness proof, which follows Wright and Felleisen's syntactic approach (1994b), appears in (Skalka and Pottier, 2002). Last, a hybrid approach, which combines some of the advantages of the previous two, is given in (Pottier, 2001a).

An alternate presentation of HM(X)

The presentation of HM(X) given in Figure 1-7 has only four syntax-directed rules out of eight. It is a good specification of the type system, but it is far from an algorithmic description. As a first step towards such a description, we provide an alternate presentation of HM(X), where generalization is performed only at let expressions and instantiation takes place only at references to program identifiers (Figure 1-8). It has the property that all judgements are of the form $C, \Gamma \vdash t : T$, rather than $C, \Gamma \vdash t : \sigma$. The following theorem states that the two presentations are indeed equivalent.
1.4.3 Theorem: $C, \Gamma \vdash t : T$ is derivable via the rules of Figure 1-8 if and only if it is a valid HM(X) judgement.
Figure 1-8: An alternate presentation of HM(X)
This theorem shows that the rule sets of Figures 1-7 and 1-8 derive the same monomorphic judgements, that is, the same judgements of the form $C, \Gamma \vdash t : T$. The fact that judgements of the form $C, \Gamma \vdash t : \sigma$, where $\sigma$ is not a monotype, cannot be derived using the new rule set is a technical simplification, without deep significance; the first two exercises below shed some light on this issue.
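A sketch of inference in the syntax-directed style of Figure 1-8 may help locate where instantiation and generalization occur. The code below is a hedged toy of our own, not the chapter's algorithm: it generates equality constraints, instantiates schemes only at variables, and generalizes only at let; solving the emitted equations is left to a separate unifier.

```python
import itertools

fresh = (f"_T{i}" for i in itertools.count())

def ftv(t):
    return {t} if isinstance(t, str) else set().union(set(), *(ftv(a) for a in t[1:]))

def infer(env, term, eqs):
    """Terms: ("var",x), ("abs",x,t), ("app",t1,t2), ("let",x,t1,t2).
    Schemes in env: (quantified_vars, constraint_eqs, type)."""
    kind = term[0]
    if kind == "var":                        # instantiate: rename quantified vars
        qs, d_eqs, ty = env[term[1]]
        ren = {q: next(fresh) for q in qs}
        def r(t):
            if isinstance(t, str):
                return ren.get(t, t)
            return (t[0],) + tuple(r(a) for a in t[1:])
        eqs.extend((r(s), r(u)) for (s, u) in d_eqs)   # copy the constraint D
        return r(ty)
    if kind == "abs":                        # λx.t: a fresh variable for x
        a = next(fresh)
        return ("arrow", a, infer({**env, term[1]: ((), [], a)}, term[2], eqs))
    if kind == "app":                        # t1 t2: equate with an arrow type
        f = infer(env, term[1], eqs)
        arg = infer(env, term[2], eqs)
        b = next(fresh)
        eqs.append((f, ("arrow", arg, b)))
        return b
    # let x = t1 in t2: generalize over variables not free in the environment
    _, x, t1, t2 = term
    local = []
    ty1 = infer(env, t1, local)
    env_ftv = set().union(set(), *(ftv(ty) for (_, _, ty) in env.values()))
    gen = (ftv(ty1) | {v for eq in local for side in eq for v in ftv(side)}) - env_ftv
    return infer({**env, x: (tuple(gen), local, ty1)}, t2, eqs)
```

On `let id = λz.z in id id`, the two occurrences of `id` receive distinct fresh copies of the scheme's body, which is exactly the behavior the alternate presentation prescribes.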
1.4.4 Exercise [$\star\star$]: Show that both rule sets lead to the same set of well-typed programs.
1.4.5 Exercise [$\star\star$]: Show that, if HMX-GEN is added to the rule set of Figure 1-8, then both rule sets derive exactly the same judgements.
1.4.6 Exercise [$\star\star\star, \nrightarrow$]: Show that it is possible to simplify the presentation of Damas and Milner's type system in an analogous manner. That is, define an alternate set of typing rules for DM, which allows deriving judgements of the form $\Gamma \vdash t : T$; then, show that this new rule set is equivalent to the previous one, in the same sense as above. Which auxiliary properties of DM does your proof require? A solution is given in (Clément, Despeyroux, Despeyroux, and Kahn, 1986).

Relating HM(X) with Damas and Milner's type system

In order to explain our interest in HM(X), we wish to show that it is more general than Damas and Milner's type system. Since HM(X) really is a family of type systems, we must make this statement more precise. First, every member of the HM(X) family contains DM. Conversely, DM contains HM(=), the constraint-based type system obtained by specializing HM(X) to the setting of an equality-only syntactic model.
The first of these assertions is easy to prove, because the mapping from DM judgements to HM(X) judgements is essentially the identity: every valid DM judgement may be viewed as a valid HM(X) judgement under the trivial assumption true. This statement relies on the fact that the DM type scheme $\forall\bar{X}.T$ is identified with the constrained type scheme $\forall\bar{X}[\mathit{true}].T$, so DM type schemes (resp. environments) form a subset of HM(X) type schemes (resp. environments). Its proof is routine, except perhaps in the case of DM-INST, where it is shown how the effect of applying a substitution in DM is emulated by strengthening the current constraint in HM(X).
1.4.7 Theorem: If $\Gamma \vdash t : S$ holds in DM, then $\mathit{true}, \Gamma \vdash t : S$ holds in HM(X).
Proof: The proof is by structural induction on a derivation of $\Gamma \vdash t : S$. In each proof case, we adopt the notations of Figure 1-3.
  • Case DM-VAR. The rule's conclusion is $\Gamma \vdash x : S$. Its premise is $\Gamma(x) = S$ (1). By definition and by C-EX*, the constraint $\exists S$ is equivalent to true. By applying HMX-VAR to (1) and to the assertion $\mathit{true} \Vdash \mathit{true}$, we obtain $\mathit{true}, \Gamma \vdash x : S$.
  • Cases DM-ABS, DM-APP, DM-LET. By the induction hypothesis and by HMX-ABS, HMX-APP, or HMX-LET, respectively.
  • Case DM-GEN. The rule's conclusion is $\Gamma \vdash t : \forall\bar{X}.T$. Its premises are $\Gamma \vdash t : T$ (1) and $\bar{X} \# \mathit{ftv}(\Gamma)$ (2). Applying the induction hypothesis to (1) yields $\mathit{true}, \Gamma \vdash t : T$ (3). Furthermore, (2) implies $\bar{X} \# \mathit{ftv}(\mathit{true}, \Gamma)$ (4). By HMX-GEN, (3) and (4) yield $\mathit{true}, \Gamma \vdash t : \forall\bar{X}[\mathit{true}].T$.
  • Case DM-INST. The rule's conclusion is $\Gamma \vdash t : [\vec{X} \mapsto \vec{T}]T$. Its premise is $\Gamma \vdash t : \forall\bar{X}.T$ (1). We may assume, w.l.o.g., $\bar{X} \# \mathit{ftv}(\Gamma, \vec{T})$ (2). Applying the induction hypothesis to (1) yields $\mathit{true}, \Gamma \vdash t : \forall\bar{X}[\mathit{true}].T$ (3). By HMX-INST, (3) implies $\mathit{true}, \Gamma \vdash t : T$ (4). By Lemma 1.4.2, we may weaken this judgement so as to obtain $\vec{X} = \vec{T}, \Gamma \vdash t : T$ (5). Using C-EQ, C-EXTRANS, and C-EXAND, it is possible to establish $\vec{X} = \vec{T} \Vdash T = [\vec{X} \mapsto \vec{T}]T$ (6). Applying HMX-SUB to (5) and (6), we find $\vec{X} = \vec{T}, \Gamma \vdash t : [\vec{X} \mapsto \vec{T}]T$ (7). Last, (2) implies $\bar{X} \# \mathit{ftv}(\Gamma, [\vec{X} \mapsto \vec{T}]T)$ (8). Applying HMX-EXISTS to (7) and (8), we obtain $\exists\bar{X}.(\vec{X} = \vec{T}), \Gamma \vdash t : [\vec{X} \mapsto \vec{T}]T$ (9). By (2) and C-NAME, the constraint $\exists\bar{X}.(\vec{X} = \vec{T})$ is equivalent to true, so (9) is the goal.
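The key step of this proof trades the substitution $[\vec{X} \mapsto \vec{T}]$ for the equations $\vec{X} = \vec{T}$. That trade can be mimicked concretely: reading the instance of the body off the solved equations coincides with applying the substitution directly. A toy sketch, with a representation of our own:

```python
def subst(m, t):
    """Apply a variable-to-type map to a toy type."""
    if isinstance(t, str):
        return m.get(t, t)
    return (t[0],) + tuple(subst(m, a) for a in t[1:])

def instance_via_equations(xs, ts, body):
    """Record the equations X_i = T_i (the X_i fresh, so the system is
    already a solved form), then read the instance of body off it."""
    solved = dict(zip(xs, ts))
    return subst(solved, body)
```

The freshness side condition (2) is what guarantees that the equations form a solved form, so no actual unification is needed here.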
We are now interested in proving that HM(=), as defined above, is contained within DM. To this end, we must translate every HM(=) judgement to a DM judgement. It quickly turns out that this is possible if the original judgement's constraint assumption is satisfiable.
We begin by explaining how an HM(=) type scheme is translated into a DM type scheme. Such a translation is made possible by the fact that the definition of HM(=) assumes an equality-only syntactic model. Indeed, in that setting, every satisfiable constraint admits a most general unifier (Definition 1.3.41), whose properties we make essential use of.
In fact, we must not only translate a type scheme, but also apply a type substitution to it. Instead of separating these steps, we perform both at once, and parameterize the translation by a type substitution $\theta$. (It does not appear that separating them would help.) The definition of $\llbracket\sigma\rrbracket_\theta$ is somewhat involved: it is given in the statement of the following lemma, whose proof establishes that the definition is indeed well-formed.
1.4.8 Lemma: Consider a type scheme $\sigma$ and an idempotent type substitution $\theta$ such that $\mathit{ftv}(\sigma) \subseteq \mathit{dom}(\theta)$ (1) and $\exists\theta \Vdash \exists\sigma$ (2). Write $\sigma = \forall\bar{X}[D].T$, where $\bar{X} \# \theta$ (3). Then, there exists a type substitution $\theta'$ such that $\theta'$ extends $\theta$, $\mathit{dom}(\theta')$ is $\mathit{dom}(\theta) \cup \bar{X}$, and $\theta'$ is a most general unifier of $\exists\theta \wedge D$. Let $\bar{Y} = \mathit{ftv}(\theta'(\bar{X})) \setminus \mathit{range}(\theta)$. Then, the translation of $\sigma$ under $\theta$, written $\llbracket\sigma\rrbracket_\theta$, is the DM type scheme $\forall\bar{Y}.\theta'(T)$. This is a well-formed definition. Furthermore, $\mathit{ftv}(\llbracket\sigma\rrbracket_\theta) \subseteq \mathit{range}(\theta)$ holds.
Proof: By (2), $\exists\theta$ is equivalent to $\exists\theta \wedge \exists\sigma$, which may be written $\exists\theta \wedge \exists\bar{X}.D$. By (3) and C-ExAnd, this is $\exists\bar{X}.(\exists\theta \wedge D)$. Thus, because $\theta$ is a most general unifier of $\exists\theta$, $\theta$ is also a most general unifier of $\exists\bar{X}.(\exists\theta \wedge D)$ (4). Furthermore, $ftv(\exists\bar{X}.(\exists\theta \wedge D))$ is $ftv(\exists\theta \wedge \exists\sigma)$, which by definition of $\exists\theta$ and by (1) is a subset of $dom(\theta)$ (5). By (4), (3), (5), and Lemma 1.3.45, there exists a type substitution $\theta'$ such that $\theta'$ extends $\theta$ (6) and $\theta'$ is a most general unifier of $\exists\theta \wedge D$ (7) and $dom(\theta') = dom(\theta) \cup \bar{X}$ (8).
Let us now define $\bar{Y} = ftv(\theta'(\bar{X})) \setminus range(\theta)$ and $\llbracket\sigma\rrbracket_\theta = \forall\bar{Y}.\theta'(T)$. By (1), we have $ftv(T) \subseteq \bar{X} \cup dom(\theta)$. Applying $\theta'$ and exploiting (6), we find $ftv(\theta'(T)) \subseteq ftv(\theta'(\bar{X})) \cup range(\theta)$, which by definition of $\bar{Y}$ may be written $ftv(\theta'(T)) \subseteq \bar{Y} \cup range(\theta)$. Subtracting $\bar{Y}$ on each side, we find $ftv(\llbracket\sigma\rrbracket_\theta) \subseteq range(\theta)$ (9).
To show that the definition of $\llbracket\sigma\rrbracket_\theta$ is valid, it remains to show that it does not depend on the choice of $\bar{X}$ or $\theta'$. To prove the former, it suffices to establish $\bar{X} \mathbin{\#} ftv(\llbracket\sigma\rrbracket_\theta)$, which indeed follows from (3) and (9). As for the latter, because of the constraints imposed by (6), (7), and (8), and by Lemma 1.3.44, distinct choices of $\theta'$ may differ only by a renaming of $ftv(\theta'(\bar{X})) \setminus range(\theta)$, that is, $\bar{Y}$. So, we must check $\bar{Y} \mathbin{\#} ftv(\llbracket\sigma\rrbracket_\theta)$, which holds by definition.
Note that if $\sigma$ is in fact a type $T$, where $ftv(T) \subseteq dom(\theta)$, then $\bar{X}$ is empty, so $\theta'$ is $\theta$, $\bar{Y}$ is empty, and $\llbracket T\rrbracket_\theta = \theta(T)$. In other words, the translation of a type under $\theta$ is its image through $\theta$. More generally, the translation of an unconstrained type scheme (that is, a type scheme whose constraint is true) is its image through $\theta$, as stated by the following exercise.
1.4.9 Exercise [$\star\star$, $\nrightarrow$]: Prove that $\llbracket\forall\bar{X}.T\rrbracket_\theta$, when defined, is $\theta(\forall\bar{X}.T)$.
The translation becomes more than a mere type substitution when applied to a nontrivial constrained type scheme. Some examples of this situation are given below.
1.4.10 Example: Let $\sigma = \forall XY[X = Y \to Y].X$. Let $\theta$ be the identity substitution. The type scheme $\sigma$ is closed and the constraint $\exists\sigma$ is equivalent to true, so $\llbracket\sigma\rrbracket_\theta$ is defined. We must find a type substitution $\theta'$ whose domain is $XY$ and that is a most general unifier of $X = Y \to Y$. All such substitutions are of the form $[X \mapsto (Z \to Z), Y \mapsto Z]$, where $Z$ is fresh. We have $ftv(\theta'(XY)) = Z$, whence $\llbracket\sigma\rrbracket_\theta = \forall Z.Z \to Z$. Note that the choice of $Z$ does not matter, since it is bound in $\llbracket\sigma\rrbracket_\theta$. Roughly speaking, the effect of the translation was to replace the body $X$ of the constrained type scheme with its most general solution under the constraint $X = Y \to Y$.
Let $\sigma = \forall XY_1[X = Y_1 \to Y_2].X$. Let $\theta = [Y_2 \mapsto Z_2]$. We have $ftv(\sigma) = Y_2 \subseteq dom(\theta)$. The constraint $\exists\sigma$ is equivalent to true, so $\llbracket\sigma\rrbracket_\theta$ is defined. We must find a type substitution $\theta'$ whose domain is $XY_1Y_2$, that extends $\theta$, and that is a most general unifier of $X = Y_1 \to Y_2$. All such substitutions are of the form $[X \mapsto (Z_1 \to Z_2), Y_1 \mapsto Z_1, Y_2 \mapsto Z_2]$, where $Z_1$ is fresh. We have $ftv(\theta'(XY_1)) \setminus range(\theta) = Z_1Z_2 \setminus Z_2 = Z_1$, whence $\llbracket\sigma\rrbracket_\theta = \forall Z_1.Z_1 \to Z_2$.
The type variable $Z_2$ is not universally quantified, even though it appears in the image of $X$, which was universally quantified in $\sigma$, because $Z_2$ is the image of $Y_2$, which was free in $\sigma$.
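The most general unifiers used in these examples can be computed mechanically. The following Python sketch is our own illustration, not part of the chapter's formal development: it implements naive first-order unification over arrow types and replays both equations above. Where the text picks fresh variables $Z$ and $Z_1$, the solver simply reuses $Y$ and $Y_1$ as representatives; the answers differ only by a renaming.

```python
# Naive first-order unification over arrow types (illustrative sketch; the
# chapter's constraint language is richer). A type is a variable name (str)
# or a tuple ('->', domain, codomain).

def apply(subst, t):
    """Apply a substitution (a dict from variables to types) to a type."""
    if isinstance(t, str):
        return apply(subst, subst[t]) if t in subst else t
    _, dom, cod = t
    return ('->', apply(subst, dom), apply(subst, cod))

def occurs(x, t, subst):
    """Occurs check: does variable x appear in t under subst?"""
    t = apply(subst, t)
    if isinstance(t, str):
        return t == x
    return occurs(x, t[1], subst) or occurs(x, t[2], subst)

def unify(t1, t2, subst=None):
    """Extend subst to a most general unifier of t1 = t2, or fail."""
    subst = dict(subst or {})
    t1, t2 = apply(subst, t1), apply(subst, t2)
    if t1 == t2:
        return subst
    if isinstance(t2, str) and not isinstance(t1, str):
        t1, t2 = t2, t1  # ensure the variable, if any, is on the left
    if isinstance(t1, str):
        if occurs(t1, t2, subst):
            raise TypeError("occurs check failed")
        subst[t1] = t2
        return subst
    subst = unify(t1[1], t2[1], subst)
    return unify(t1[2], t2[2], subst)

# Example 1.4.10: a most general unifier of X = Y -> Y maps X to Y -> Y.
theta1 = unify('X', ('->', 'Y', 'Y'))
assert apply(theta1, 'X') == ('->', 'Y', 'Y')

# Second example: extending theta = [Y2 |-> Z2] with X = Y1 -> Y2.
theta2 = unify('X', ('->', 'Y1', 'Y2'), {'Y2': 'Z2'})
assert apply(theta2, 'X') == ('->', 'Y1', 'Z2')
```

In the second run, $Z_2$ remains free in the image of $X$, matching the discussion above: only the fresh variable (here represented by $Y_1$) would be universally quantified by the translation.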
Before attacking the main theorem, let us establish a couple of technical properties of the translation. First, $\llbracket\sigma\rrbracket_\theta$ is insensitive to the behavior of $\theta$ outside $ftv(\sigma)$, a natural property, since our informal intent is for $\theta$ to be applied to $\sigma$.
1.4.11 Lemma: If $\theta_1$ and $\theta_2$ coincide on $ftv(\sigma)$, then $\llbracket\sigma\rrbracket_{\theta_1}$ and $\llbracket\sigma\rrbracket_{\theta_2}$ are either both undefined, or both defined and identical.
Second, if $C \Vdash \sigma \preceq T'$ holds, then the translations of $\sigma$ and $T'$ under a most general unifier of $C$ are in Damas and Milner's instance relation. One might say, roughly speaking, that the instance relation is preserved by the translation.
1.4.12 Lemma: Let $ftv(\sigma, T') \subseteq dom(\theta)$ (1) and $\exists\theta \Vdash \exists\sigma$ (2). Let $\exists\theta \Vdash \sigma \preceq T'$ (3). Then, $\theta(T')$ is an instance of the DM type scheme $\llbracket\sigma\rrbracket_\theta$.
Proof: Write $\sigma = \forall\bar{X}[D].T$, where $\bar{X} \mathbin{\#} \theta$ (4) and $\bar{X} \mathbin{\#} ftv(T')$ (5). By (1), (2), and (4), one may define $\theta'$, $\bar{Y}$, and $\llbracket\sigma\rrbracket_\theta$ exactly as in the statement of Lemma 1.4.8. By (5) and Definition 1.3.3, (3) is synonymous with $\exists\theta \Vdash \exists\bar{X}.(D \wedge T = T')$. Reasoning in the same manner as in the first paragraph of the proof of Lemma 1.4.8, we find that there exists a type substitution $\theta''$ such that $\theta''$ extends $\theta$, $dom(\theta'')$ is $dom(\theta) \cup \bar{X}$, and $\theta''$ is a most general unifier of $\exists\theta \wedge D \wedge T = T'$.
We have $dom(\theta') = dom(\theta'')$ (6). Furthermore, $\theta'$ is a most general unifier of $\exists\theta \wedge D$, while $\theta''$ is a most general unifier of $\exists\theta \wedge D \wedge T = T'$, which implies $\exists\theta'' \Vdash \exists\theta'$ (7). By Lemma 1.3.39, $\theta''$ refines $\theta'$. That is, there exists a type substitution $\varphi$ such that $\theta''$ is the restriction of $\varphi \circ \theta'$ to $dom(\theta) \cup \bar{X}$ (8). We may require $dom(\varphi) \subseteq range(\theta) \cup ftv(\theta'(\bar{X}))$ (9) without compromising (8).
Consider $X \in dom(\theta)$. Because $\theta''$ extends $\theta$, we have $\theta''(X) = \theta(X)$ (10). Furthermore, by (8), we have $\theta''(X) = (\varphi \circ \theta')(X) = (\varphi \circ \theta)(X)$ (11). Using (10) and (11), we find $\theta(X) = \varphi(\theta(X))$. Because this holds for every $X \in dom(\theta)$, $\varphi$ must be the identity over $range(\theta)$; that is, $dom(\varphi) \mathbin{\#} range(\theta)$ (12) holds. Combining (9) and (12), we find $dom(\varphi) \subseteq ftv(\theta'(\bar{X})) \setminus range(\theta)$, that is, $dom(\varphi) \subseteq \bar{Y}$ (13).
By construction of $\theta''$, we have $\exists\theta'' \Vdash T = T'$. By Lemma 1.3.29, this implies $\theta''(\exists\theta'') \Vdash \theta''(T) = \theta''(T')$, which by Lemma 1.3.42 may be read true $\Vdash \theta''(T) = \theta''(T')$. By Lemma 1.3.32, $\theta''(T)$ and $\theta''(T')$ coincide. Because by (1) $ftv(T)$ is a subset of $dom(\theta) \cup \bar{X}$ and by (8), the former may be written $\varphi(\theta'(T))$. By (1) and because $\theta''$ extends $\theta$, the latter is $\theta(T')$. Thus, we have $\varphi(\theta'(T)) = \theta(T')$. Together with (13), this establishes that $\theta(T')$ is an instance of $\forall\bar{Y}.\theta'(T)$, that is, $\llbracket\sigma\rrbracket_\theta$.
We extend the translation to environments as follows. $\llbracket\varnothing\rrbracket_\theta$ is $\varnothing$. If $\exists\theta \Vdash \exists\sigma$ holds, then $\llbracket\Gamma; x : \sigma\rrbracket_\theta$ is $\llbracket\Gamma\rrbracket_\theta; x : \llbracket\sigma\rrbracket_\theta$; otherwise, it is $\llbracket\Gamma\rrbracket_\theta$. Notice that $\llbracket\Gamma\rrbracket_\theta$ may contain fewer bindings than $\Gamma$, which ensures that bindings $x : \sigma$ for which $\exists\theta \Vdash \exists\sigma$ does not hold will not be used in the translation. Note that $\llbracket\Gamma\rrbracket_\theta$ is defined when $ftv(\Gamma) \subseteq dom(\theta)$ holds.
We are now ready to prove the main theorem. Note that, by requiring $\theta$ to be a most general unifier of $C$, we also require $C$ to be satisfiable. Judgements that carry an unsatisfiable constraint cannot be translated.
1.4.13 Theorem: Let $C, \Gamma \vdash t : \sigma$ hold in HM(=). Let $\theta$ be a most general unifier of $C$ such that $ftv(\Gamma, \sigma) \subseteq dom(\theta)$. Then, $\llbracket\Gamma\rrbracket_\theta \vdash t : \llbracket\sigma\rrbracket_\theta$ holds in DM.
Proof: Let us first remark that, by Lemma 1.4.1, we have $C \Vdash \exists\sigma$. This may be written $\exists\theta \Vdash \exists\sigma$, which guarantees that $\llbracket\sigma\rrbracket_\theta$ is defined. The proof is by structural induction on an HM(=) typing derivation. We assume that the derivation is expressed in terms of the rules of Figure 1-8, but split HMD-LetGen into HMX-Let and HMX-Gen for the sake of readability.
  • Case HMD-VarInst. The rule's conclusion is $C \wedge D, \Gamma \vdash x : T$. By hypothesis, $\theta$ is a most general unifier of $C \wedge D$ (1), and $ftv(T) \subseteq dom(\theta)$ (2) holds. The rule's premise is $\Gamma(x) = \sigma$ (3), where $\sigma$ stands for $\forall\bar{X}[D].T$. By (1), we have $\exists\theta \equiv C \wedge D \Vdash D \Vdash \exists\bar{X}.D \equiv \exists\sigma$ (4). Furthermore, we have $ftv(\sigma) \subseteq ftv(\Gamma) \subseteq dom(\theta)$ (5). These facts show that $\llbracket\sigma\rrbracket_\theta$ is defined. Together with (3), this implies $\llbracket\Gamma\rrbracket_\theta(x) = \llbracket\sigma\rrbracket_\theta$. By DM-Var, $\llbracket\Gamma\rrbracket_\theta \vdash x : \llbracket\sigma\rrbracket_\theta$ (6) follows. Now, by Lemma 1.3.19, we have $D \Vdash \sigma \preceq T$, which, combined with $\exists\theta \Vdash D$, yields $\exists\theta \Vdash \sigma \preceq T$ (7). By (7), (4), (5), (2), and Lemma 1.4.12, we find that $\theta(T)$ is an instance of $\llbracket\sigma\rrbracket_\theta$. Thus, applying DM-Inst to (6) yields $\llbracket\Gamma\rrbracket_\theta \vdash x : \theta(T)$.
  • Case HMD-Abs. The rule's conclusion is $C, \Gamma \vdash \lambda z.t : T \to T'$. Its premise is $C, (\Gamma; z : T) \vdash t : T'$. Applying the induction hypothesis to it yields $\llbracket\Gamma\rrbracket_\theta; z : \theta(T) \vdash t : \theta(T')$. By DM-Abs, this implies $\llbracket\Gamma\rrbracket_\theta \vdash \lambda z.t : \theta(T) \to \theta(T')$, that is, $\llbracket\Gamma\rrbracket_\theta \vdash \lambda z.t : \theta(T \to T')$.
  • Case HMD-App. By an extension of $dom(\theta)$ to include $ftv(T)$, by the induction hypothesis, and by DM-App.
  • Case HMX-Let. By an extension of $dom(\theta)$ to include $ftv(\sigma)$, by the induction hypothesis, and by DM-Let.
  • Case HMX-Gen. The rule's conclusion is $C \wedge \exists\sigma, \Gamma \vdash t : \sigma$, where $\sigma$ stands for $\forall\bar{X}[D].T$. By hypothesis, $\theta$ is a most general unifier of $C \wedge \exists\sigma$ (1), and $ftv(\Gamma, \sigma) \subseteq dom(\theta)$ (2) holds. The rule's premises are $C \wedge D, \Gamma \vdash t : T$ (3) and $\bar{X} \mathbin{\#} ftv(C, \Gamma)$ (4). We may further assume, w.l.o.g., $\bar{X} \mathbin{\#} \theta$ (5). Given (1), (2), and (5), we may define $\theta'$ and $\bar{Y}$ exactly as in Lemma 1.4.8. Then, $\theta'$ is a most general unifier of $\exists\theta \wedge D$, that is, of $C \wedge D$. Furthermore, $dom(\theta')$ is $dom(\theta) \cup \bar{X}$, which by (2) is a superset of $ftv(\Gamma, T)$. Thus, the induction hypothesis applies to $\theta'$ and to (3), yielding $\llbracket\Gamma\rrbracket_{\theta'} \vdash t : \theta'(T)$. Because $\theta'$ extends $\theta$, by (2) and by Lemma 1.4.11, this may be read $\llbracket\Gamma\rrbracket_\theta \vdash t : \theta'(T)$ (6). According to Lemma 1.4.8, we have $ftv(\llbracket\Gamma\rrbracket_\theta) \subseteq range(\theta)$, which by construction of $\bar{Y}$ implies $\bar{Y} \mathbin{\#} ftv(\llbracket\Gamma\rrbracket_\theta)$ (7). By DM-Gen, (6) and (7) yield $\llbracket\Gamma\rrbracket_\theta \vdash t : \forall\bar{Y}.\theta'(T)$, that is, $\llbracket\Gamma\rrbracket_\theta \vdash t : \llbracket\sigma\rrbracket_\theta$.
  • Case HMD-Sub. The rule's conclusion is $C, \Gamma \vdash t : T'$. By hypothesis, $\theta$ is a most general unifier of $C$ (1), and $ftv(\Gamma, T') \subseteq dom(\theta)$ (2) holds. The goal is $\llbracket\Gamma\rrbracket_\theta \vdash t : \theta(T')$ (3). The rule's premises are $C, \Gamma \vdash t : T$ (4) and $C \Vdash T = T'$ (5). We may assume, w.l.o.g., $ftv(T) \mathbin{\#} range(\theta)$ (6). Then, by (6) and Lemma 1.3.43, we may extend the domain of $\theta$ so as to achieve $ftv(T) \subseteq dom(\theta)$ (7), without compromising (1) or (2) or affecting the goal (3). By (1), (2), and (7), the induction hypothesis applies to (4), yielding $\llbracket\Gamma\rrbracket_\theta \vdash t : \theta(T)$ (8). Now, thanks to (1), (5) may be read $\exists\theta \Vdash T = T'$, which by Lemmas 1.3.29 and 1.3.42 implies true $\Vdash \theta(T) = \theta(T')$. Then, Lemma 1.3.32 shows that $\theta(T)$ and $\theta(T')$ coincide. As a result, (8) is the goal (3).
  • Case HMD-Exists. The rule's conclusion is $\exists\bar{X}.C, \Gamma \vdash t : T$. By hypothesis, $\theta$ is a most general unifier of $\exists\bar{X}.C$ (1), and $ftv(\Gamma, T) \subseteq dom(\theta)$ (2) holds. The rule's premises are $C, \Gamma \vdash t : T$ (3) and $\bar{X} \mathbin{\#} ftv(\Gamma, T)$. We may assume, w.l.o.g., $\bar{X} \mathbin{\#} \theta$ (4). As in the previous case, we may extend the domain of $\theta$ to guarantee $ftv(\exists\bar{X}.C) \subseteq dom(\theta)$ (5). By (1), (4), (5), and Lemma 1.3.45, there exists a type substitution $\theta'$ such that $\theta'$ extends $\theta$ (6) and $\theta'$ is a most general unifier of $C$. Applying the induction hypothesis to $\theta'$ and to (3) yields $\llbracket\Gamma\rrbracket_{\theta'} \vdash t : \theta'(T)$. By (2), (6), and Lemma 1.4.11, this may be read $\llbracket\Gamma\rrbracket_\theta \vdash t : \theta(T)$.
Together, Theorems 1.4.7 and 1.4.13 yield a precise correspondence between DM and HM(=): there exists a compositional translation from each to the other. In other words, they may be viewed as two equivalent formulations of a single type system. One might also say that HM(=) is a constraint-based formulation of DM. Furthermore, Theorem 1.4.7 states that every member of the HM(X) family is an extension of DM. This explains our double interest in HM(X): as an alternate formulation of DM, which we believe is more pleasant, for reasons already discussed, and as a more expressive framework.

1.5 A purely constraint-based type system: PCB(X)

In the previous section, we presented HM(X), an elegant constraint-based extension of Damas and Milner's type system. However, HM(X), as described there, suffers from a drawback. A typing judgement involves both a constraint, which represents an assumption about its free type variables, and an environment, which represents an assumption about its free program identifiers. At a let node, HMD-LETGEN turns a part of the current constraint, namely D, into a type scheme, namely ∀X̄[D].T, and stores it into the environment. Then, at every occurrence of the let-bound variable, HMD-VARINST retrieves this type scheme from the environment and adds a copy of D back to the current constraint. In practice, it is important to simplify the type scheme ∀X̄[D].T before it is stored in the environment, because it would be inefficient to copy an unsimplified constraint. In other words, it appears that, in order to preserve efficiency, constraint generation and constraint simplification cannot be separated.
Of course, in practice, it is not difficult to intermix these phases, so the problem is not technical, but pedagogical. Indeed, we argued earlier that it is natural and desirable to separate them. Type scheme introduction and elimination constraints, which we introduced in Section 1.3 but did not use in the specification of HM(X), are intended as a means of solving this problem. In the present section, we exploit them to give a novel formulation of HM(X), which no longer requires copying constraints back and forth between the environment and the constraint assumption. In fact, the environment is
$\frac{C \Vdash \mathrm{x} \preceq \mathrm{T}}{C \vdash \mathrm{x} : \mathrm{T}}$ (VAR)

$\frac{C \vdash \mathrm{t} : \mathrm{T}'}{\text{let } \mathrm{z} : \mathrm{T} \text{ in } C \vdash \lambda \mathrm{z} . \mathrm{t} : \mathrm{T} \rightarrow \mathrm{T}'}$ (ABS)

$\frac{C_1 \vdash \mathrm{t}_1 : \mathrm{T} \rightarrow \mathrm{T}' \quad C_2 \vdash \mathrm{t}_2 : \mathrm{T}}{C_1 \wedge C_2 \vdash \mathrm{t}_1\, \mathrm{t}_2 : \mathrm{T}'}$ (APP)

$\frac{C_1 \vdash \mathrm{t}_1 : \mathrm{T}_1 \quad C_2 \vdash \mathrm{t}_2 : \mathrm{T}_2}{\text{let } \mathrm{z} : \forall \mathcal{V}[C_1].\mathrm{T}_1 \text{ in } C_2 \vdash \text{let } \mathrm{z} = \mathrm{t}_1 \text{ in } \mathrm{t}_2 : \mathrm{T}_2}$ (LET)

$\frac{C \vdash \mathrm{t} : \mathrm{T}}{C \wedge \mathrm{T} \leq \mathrm{T}' \vdash \mathrm{t} : \mathrm{T}'}$ (SUB)

$\frac{C \vdash \mathrm{t} : \mathrm{T} \quad \overline{\mathrm{X}} \# \mathit{ftv}(\mathrm{T})}{\exists \overline{\mathrm{X}} . C \vdash \mathrm{t} : \mathrm{T}}$ (EXISTS)
Figure 1-9: Typing rules for PCB ( X ) PCB ( X ) PCB(X)\operatorname{PCB}(X)PCB(X)
suppressed altogether: taking advantage of the new constraint forms, we encode information about program identifiers within the constraint assumption.
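To make the new constraint forms concrete, here is a minimal sketch, in Python, of an abstract syntax for the constraint language of Section 1.3 as PCB(X) uses it. Every class and field name is ours, chosen for illustration; a real implementation would carry a full signature of type constructors rather than just arrows.

```python
# A sketch of the constraint language.  The two forms that replace the
# environment are CInst ("x ⪯ T", a type scheme instantiation constraint)
# and CLet ("let z : σ in C", a type scheme introduction constraint).
from dataclasses import dataclass

@dataclass(frozen=True)
class TVar:                 # type variable X
    name: str

@dataclass(frozen=True)
class TArrow:               # T -> T'
    dom: object
    cod: object

@dataclass(frozen=True)
class Forall:               # type scheme  ∀X̄[C].T
    vars: tuple
    constraint: object
    body: object

@dataclass(frozen=True)
class CTrue:                # truth
    pass

@dataclass(frozen=True)
class CAnd:                 # C1 ∧ C2
    left: object
    right: object

@dataclass(frozen=True)
class CExists:              # ∃X̄.C
    vars: tuple
    body: object

@dataclass(frozen=True)
class CSub:                 # T ≤ T'
    sub: object
    sup: object

@dataclass(frozen=True)
class CInst:                # x ⪯ T
    ident: str
    ty: object

@dataclass(frozen=True)
class CLet:                 # let z : σ in C
    ident: str
    scheme: Forall
    body: object

# "let id : ∀X[true].X → X in id ⪯ Y": the let prefix gives meaning to
# the instantiation constraint on id.
example = CLet("id",
               Forall(("X",), CTrue(), TArrow(TVar("X"), TVar("X"))),
               CInst("id", TVar("Y")))
```

Where HM(X) would look up id in an environment, a PCB(X) derivation simply carries the `CInst` node inside its constraint assumption.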

Presentation

We now employ the full constraint language (Section 1.3). Typing judgements take the form C ⊢ t : T, where C may have free type variables and free program identifiers. The rules that allow deriving such judgements appear in Figure 1-9. As before, we identify judgements up to constraint equivalence.
Let us review the rules. VAR states that x has type T under any constraint that entails x ⪯ T. Note that we no longer consult the type scheme associated with x in the environment; indeed, there is no environment. Instead, we let the constraint assumption record the fact that the type scheme should admit T as one of its instances. Thus, in a judgement C ⊢ t : T, any program identifier that occurs free within t typically also occurs free within C. ABS requires the body t of a λ-abstraction to have type T′ under assumption C. Although no explicit assumption about z appears in the premise, C typically contains a number of instantiation constraints bearing on z, of the form z ⪯ Tᵢ. In the rule's conclusion, C is wrapped within the prefix let z : T in [ ], where T is the type assigned to z. This effectively requires every Tᵢ to denote a supertype of T, as desired. Please note that z does not occur free in the constraint let z : T in C, which is natural, since it does not occur free in λz.t. APP exhibits a minor stylistic difference with respect to HMX-APP: its constraint assumption is split between its premises. It is not difficult to prove that, when weakening holds (see Lemma 1.5.2 below), this choice does not affect the set of valid judgements. This new presentation encourages reading the rules in Figure 1-9 as the specification of an algorithm, which, given t and T, produces C such that C ⊢ t : T holds. In the case of APP, the algorithm invokes itself recursively for each of the two subexpressions, yielding the constraints C₁ and C₂, then constructs their conjunction. LET is analogous to ABS: by wrapping C₂ within a let prefix, it gives meaning to the instantiation constraints bearing on z within C₂. The difference is that z may now be assigned a type scheme, as opposed to a monotype. An appropriate type scheme is built in the most straightforward manner from the constraint C₁ and the type T₁ that describe t₁. All of the type variables that appear free in the left-hand premise are generalized, hence the notation ∀𝒱[C₁].T₁, which is a convenient shorthand for ∀ftv(C₁, T₁)[C₁].T₁. The side condition that "type variables that occur free in the environment must not be generalized", which was present in DM and HM(X), naturally disappears, since judgements no longer involve an environment. SUB again exhibits a minor stylistic difference with respect to HMX-SUB: the comments made about APP above apply here as well. EXISTS is essentially identical to HMX-EXISTS.
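The algorithmic reading suggested above can be sketched in code. The following Python fragment is our own illustration, not the chapter's implementation: terms, types, and constraints are encoded as tagged tuples, the supply of fresh type variables is naive, and SUB is folded into the ABS case. Given a term and an expected type, it produces a constraint in the spirit of Figure 1-9.

```python
# Types are ("tvar", name) and ("arrow", T1, T2); a type scheme ∀X̄[C].T
# is a triple (names, C, T); constraints are tagged tuples such as
# ("inst", "x", T) for x ⪯ T and ("let", "z", scheme, C) for let z : σ in C.
import itertools

_counter = itertools.count()

def fresh():
    """A fresh type variable (naive global supply)."""
    return ("tvar", "X%d" % next(_counter))

def generate(t, ty):
    """Produce a constraint C such that C ⊢ t : ty, following the
    algorithmic reading of Figure 1-9."""
    tag = t[0]
    if tag == "var":                    # VAR: record x ⪯ T
        return ("inst", t[1], ty)
    if tag == "abs":                    # ABS: bind z to a monotype X1,
        _, z, body = t                  # check the body at X2, and
        x1, x2 = fresh(), fresh()       # require X1 -> X2 ≤ T (SUB)
        mono = ((), ("true",), x1)      # the monotype X1, as a trivial scheme
        return ("exists", (x1[1], x2[1]),
                ("and",
                 ("let", z, mono, generate(body, x2)),
                 ("sub", ("arrow", x1, x2), ty)))
    if tag == "app":                    # APP: conjunction of the premises
        _, t1, t2 = t
        x = fresh()
        return ("exists", (x[1],),
                ("and", generate(t1, ("arrow", x, ty)),
                        generate(t2, x)))
    if tag == "let":                    # LET: generalize X, the only type
        _, z, t1, t2 = t                # variable free in the premise
        x = fresh()                     # (fresh variables introduced
        c1 = generate(t1, x)            # inside c1 are ∃-bound there)
        return ("let", z, ((x[1],), c1, x), generate(t2, ty))
    raise ValueError("unknown term: %r" % (t,))

# The identity λz.z against an expected type T:
c = generate(("abs", "z", ("var", "z")), ("tvar", "T"))
```

Up to fresh names, `c` is ∃X₁X₂.(let z : X₁ in z ⪯ X₂ ∧ X₁ → X₂ ≤ T). Solving this constraint is a separate phase, which is precisely the separation of concerns that this section advocates.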
In the standard specification of HM(X), HMD-ABS and HMD-LETGEN accumulate information in the environment. Through the environment, this information is made available to HMD-VARINST, which retrieves and copies it. Here, instead, no information is explicitly transmitted. Where a program identifier is bound, a type scheme introduction constraint is built; where a program identifier is used, a type scheme instantiation constraint is produced. The two are related only by our definition of the meaning of constraints.
The reader may be puzzled by the fact that LET allows all type variables that occur free in its left-hand premise to be generalized. The following exercise sheds some light on this issue.
1.5.1 Exercise [⋆, Recommended]: Build a type derivation for the expression λz₁. let z₂ = z₁ in z₂ within PCB(X). Draw a comparison with the solution of Exercise 1.2.21.
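As a hint at the phenomenon this exercise illustrates, here is one judgement derivable in PCB(X) for this expression (our own rendering, obtained by applying VAR twice, then LET, then ABS; it is not the book's official solution):

$$\text{let } z_1 : X_1 \text{ in } \big( \text{let } z_2 : \forall Y[z_1 \preceq Y].Y \text{ in } z_2 \preceq X_2 \big) \;\vdash\; \lambda z_1.\ \text{let } z_2 = z_1 \text{ in } z_2 \;:\; X_1 \rightarrow X_2$$

Note that Y is generalized at the inner let even though the constraint z₁ ⪯ Y mentions the λ-bound identifier z₁; the outer let prefix later gives meaning to that instantiation constraint, so the generalization is harmless.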
The following lemma is an analogue of Lemma 1.4.2.
1.5.2 Lemma [Weakening]: If C′ ⊩ C, then every derivation of C ⊢ t : T may be turned into a derivation of C′ ⊢ t : T with the same shape.
Proof: The proof is by structural induction on a derivation of C ⊢ t : T. In each proof case, we adopt the notations of Figure 1-9.
  • Case VAR. By transitivity of entailment.
  • Case ABS. The rule's conclusion is let z : T in C ⊢ λz.t : T → T′ (1). By hypothesis, we have C′ ⊩ let z : T in C (2). We may assume, w.l.o.g., z ∉ fpi(C′) (3). The rule's premise is C ⊢ t : T′ (4). Applying the induction hypothesis to (4) yields C ∧ C′ ⊢ t : T′, which by ABS implies let z : T in (C ∧ C′) ⊢ λz.t : T → T′ (5). By (3) and C-InAnd*, let z : T in (C ∧ C′) is equivalent to (let z : T in C) ∧ C′, which by (2) and C-Dup is equivalent to C′. Thus, (5) is the goal C′ ⊢ λz.t : T → T′.
  • Case APP. By applying the induction hypothesis to each premise, using the fact that C′ ⊩ C₁ ∧ C₂ implies C′ ⊩ C₁ and C′ ⊩ C₂.
  • Case LET. Analogous to the case of ABS. The induction hypothesis is applied to the second premise only.
  • Case SUB. Analogous to the case of APP.
  • Case EXISTS. See the corresponding case in the proof of Lemma 1.4.2.

Relating PCB(X) with HM(X)

Let us now provide evidence for our claim that PCB(X) is an alternate presentation of HM(X). The next two theorems define an effective translation from HM(X) to PCB(X) and back.
The first theorem states that if, within HM(X), t has type T under assumptions C and Γ, then, within PCB(X), t also has type T, under some assumption C′. The relationship C ⊩ let Γ in C′ states that C entails the residual constraint obtained by confronting Γ, which provides information about the free program identifiers in t, with C′, which contains instantiation constraints bearing on these program identifiers. The statement requires C and Γ to have no free program identifiers, which is natural, since they are part of an HM(X) judgement. The hypothesis C ⊩ ∃Γ excludes the somewhat pathological situation where Γ contains constraints not apparent in C. This hypothesis vanishes when Γ is the initial environment; see Definition 1.7.3.
1.5.3 Theorem: Let C ⊩ ∃Γ. Assume fpi(C, Γ) = ∅. If C, Γ ⊢ t : T holds in HM(X), then there exists a constraint C′ such that C′ ⊢ t : T holds in PCB(X) and C entails let Γ in C′.
Proof: The proof is by structural induction on a derivation of C, Γ ⊢ t : T. In each proof case, we adopt the notations of Figure 1-8.
  • Case HMD-VARINST. The rule's conclusion is C ∧ D, Γ ⊢ x : T. By hypothesis, we have C ∧ D ⊩ ∃Γ (1) and fpi(C, D, Γ) = ∅ (2). The rule's premise is Γ(x) = ∀X̄[D].T (3). By VAR, we have x ⪯ T ⊢ x : T, so there remains to establish C ∧ D ⊩ let Γ in x ⪯ T (4). By (3), (2), and C-InId, the constraint let Γ in x ⪯ T is equivalent to let Γ in ∀X̄[D].T ⪯ T, which, by (2) and C-In*, is itself equivalent to ∃Γ ∧ ∀X̄[D].T ⪯ T (5). By (1) and Lemma 1.3.19, C ∧ D entails (5). We have established (4).
  • Case HMD-ABS. The rule's conclusion is C, Γ ⊢ λz.t : T → T′. Its premise is C, (Γ; z : T) ⊢ t : T′ (1). The constraints ∃Γ and ∃(Γ; z : T) are equivalent, so the induction hypothesis applies to (1) and yields a constraint C′ such that C′ ⊢ t : T′ (2) and C ⊩ let Γ; z : T in C′ (3). Applying ABS to (2) yields let z : T in C′ ⊢ λz.t : T → T′. There remains to check that C entails let Γ in let z : T in C′, but that is precisely (3).
  • Case HMD-APP. The rule's conclusion is C, Γ ⊢ t₁ t₂ : T′. Its premises are C, Γ ⊢ t₁ : T → T′ (1) and C, Γ ⊢ t₂ : T (2). Applying the induction hypothesis to (1) and (2), we obtain constraints C₁′ and C₂′ such that C₁′ ⊢ t₁ : T → T′ (3) and C₂′ ⊢ t₂ : T (4) and C ⊩ let Γ in C₁′ (5) and C ⊩ let Γ in C₂′ (6). By APP, (3) and (4) imply C₁′ ∧ C₂′ ⊢ t₁ t₂ : T′. Furthermore, by C-InAnd, (5) and (6) yield C ⊩ let Γ in C₁′ ∧ C₂′.
  • Case HMD-LETGEN. The rule's conclusion is C ∧ ∃X̄.D, Γ ⊢ let z = t₁ in t₂ : T₂. By hypothesis, we have C ∧ ∃X̄.D ⊩ ∃Γ (1) and fpi(C, D, Γ) = ∅ (2). The rule's premises are C ∧ D, Γ ⊢ t₁ : T₁ (3) and X̄ # ftv(C, Γ) (4) and C ∧ ∃X̄.D, Γ′ ⊢ t₂ : T₂ (5), where Γ′ is Γ; z : ∀X̄[D].T₁. Applying the induction hypothesis to (3) yields a constraint C₁′ such that C₁′ ⊢ t₁ : T₁ (6) and C ∧ D ⊩ let Γ in C₁′ (7). By (1), (2), and C-In*, we have C ∧ ∃X̄.D ⊩ ∃Γ′.
Thus, the induction hypothesis applies to (5) and yields a constraint C₂′ such that C₂′ ⊢ t₂ : T₂ (8) and C ∧ ∃X̄.D ⊩ let Γ′ in C₂′ (9). By LET, (6) and (8) imply let z : ∀𝒱[C₁′].T₁ in C₂′ ⊢ let z = t₁ in t₂ : T₂ (10). By Lemmas 1.3.25 and 1.5.2, (10) yields let z : ∀X̄[C₁′].T₁ in C₂′ ⊢ let z = t₁ in t₂ : T₂ (11), where the universal quantification is over X̄ only. There remains to establish that C ∧ ∃X̄.D entails let Γ; z : ∀X̄[C₁′].T₁ in C₂′ (12). By (4), (2), and C-LetDup, the constraint (12) is equivalent to let Γ; z : ∀X̄[let Γ in C₁′].T₁ in C₂′. By (7), this constraint is entailed by let Γ; z : ∀X̄[C ∧ D].T₁ in C₂′, which, by (4) and C-LetAnd, is equivalent to C ∧ let Γ; z : ∀X̄[D].T₁ in C₂′, that is, C ∧ let Γ′ in C₂′. By (9), this constraint is entailed by C ∧ ∃X̄.D.
  • Case HMD-SUB. The rule's conclusion is C, Γ ⊢ t : T′. Its premises are C, Γ ⊢ t : T (1) and C ⊩ T ≤ T′ (2). Applying the induction hypothesis to (1) yields a constraint C′ such that C′ ⊢ t : T (3) and C ⊩ let Γ in C′ (4). By SUB, (3) implies C′ ∧ T ≤ T′ ⊢ t : T′. There remains to establish C ⊩ let Γ in (C′ ∧ T ≤ T′), which follows from (4) and (2) by C-InAnd*.
  • Case HMD-EXISTS. The rule's conclusion is ∃X̄.C, Γ ⊢ t : T. Its premises are C, Γ ⊢ t : T (1) and X̄ # ftv(Γ, T) (2). By hypothesis, we have ∃X̄.C ⊩ ∃Γ, which by Lemma 1.3.16 implies C ⊩ ∃Γ. Thus, the induction hypothesis applies to (1) and yields a constraint C′ such that C′ ⊢ t : T (3) and C ⊩ let Γ in C′ (4). By EXISTS, (3) and (2) imply ∃X̄.C′ ⊢ t : T. There remains to establish ∃X̄.C ⊩ let Γ in ∃X̄.C′. By congruence of entailment, (4) implies ∃X̄.C ⊩ ∃X̄.let Γ in C′. The result follows by (2) and C-InEx.
The second theorem states that if, within PCB(X), t has type T under assumption C, then, within HM(X), t also has type T, under assumptions let Γ in C and Γ. The idea is simple: the constraint C represents a combined assumption about the initial judgement's free type variables and free program identifiers. In HM(X), these two kinds of assumptions must be maintained separately. So, we split them into a pair of an environment Γ, which may be chosen arbitrarily, provided it satisfies fpi(C) ⊆ dpi(Γ), that is, provided it defines all program identifiers of interest, and the residual constraint let Γ in C, which has no free program identifiers, thus represents an assumption about the new judgement's type variables only. Distinct choices of Γ give rise to distinct HM(X) judgements, which may be incomparable; this is related to the fact that ML-the-type-system does not have principal typings (Jim, 1995). Again, the hypothesis fpi(Γ) = fpi(let Γ in C) = ∅ is natural, since we wish Γ and let Γ in C to appear in an HM(X) judgement.
1.5.4 Theorem: Assume fpi(Γ) = fpi(let Γ in C) = ∅ and C ≢ false. If C ⊢ t : T holds in PCB(X), then let Γ in C, Γ ⊢ t : T holds in HM(X).
Proof: The proof is by structural induction on a derivation of C ⊢ t : T. In each proof case, we adopt the notations of Figure 1-9.
By Lemma 1.3.30, the hypothesis C ≢ false is preserved whenever the induction hypothesis is invoked. It is explicitly used only in case VAR, where it guarantees that the identifier at hand is bound in Γ.
  • Case VAR. The rule's conclusion is C ⊢ x : T. Its premise is C ⊩ x ⪯ T (1). By Lemma 1.3.24, (1) and the hypothesis C ≢ false imply x ∈ fpi(C). Because let Γ in C has no free program identifiers, this implies x ∈ dpi(Γ), that is, the environment Γ must define x. Let Γ(x) = ∀X̄[D].T′ (2), where X̄ # ftv(Γ, T) (3). By (2), HMD-VARINST, and HMD-SUB, we have D ∧ T′ ≤ T, Γ ⊢ x : T. By (3) and HMD-EXISTS, this implies ∃X̄.(D ∧ T′ ≤ T), Γ ⊢ x : T (4). Now, by (3), the constraint ∃X̄.(D ∧ T′ ≤ T) may be written ∀X̄[D].T′ ⪯ T (5). The hypothesis fpi(Γ) = ∅ implies fpi(D) = ∅ (6). By (6), C-InId and C-In*, (5) is equivalent to let Γ in x ⪯ T. Thus, (4) may be written let Γ in x ⪯ T, Γ ⊢ x : T. By (1), by congruence of entailment, and by Lemma 1.4.2, this implies let Γ in C, Γ ⊢ x : T.
  • Case ABS. The rule's conclusion is let z : T in C ⊢ λz.t : T → T′. Its premise is C ⊢ t : T′ (1). Let Γ′ stand for Γ; z : T. Applying the induction hypothesis to (1) yields let Γ′ in C, Γ′ ⊢ t : T′. By HMD-ABS, this implies let Γ′ in C, Γ ⊢ λz.t : T → T′.
• Case App. The rule's conclusion is $C_1 \wedge C_2 \vdash t_1\,t_2 : T'$. Its premises are $C_1 \vdash t_1 : T \to T'$ and $C_2 \vdash t_2 : T$. Applying the induction hypothesis to each premise yields $\text{let } \Gamma \text{ in } C_1, \Gamma \vdash t_1 : T \to T'$ and $\text{let } \Gamma \text{ in } C_2, \Gamma \vdash t_2 : T$, which by Lemma 1.4.2 and HMD-App imply $\text{let } \Gamma \text{ in } (C_1 \wedge C_2), \Gamma \vdash t_1\,t_2 : T'$.
• Case Let. The rule's conclusion is $\text{let } z : \forall\mathcal{V}[C_1].T_1 \text{ in } C_2 \vdash \text{let } z = t_1 \text{ in } t_2 : T_2$. Its premises are $C_1 \vdash t_1 : T_1$ (1) and $C_2 \vdash t_2 : T_2$ (2). Let $\bar{X}$ stand for $ftv(C_1, T_1)$. We may require, w.l.o.g., $\bar{X} \# ftv(\Gamma, C_2)$ (3). By hypothesis, we have $fpi(\Gamma) = \varnothing$ (4). We also have $fpi(\text{let } \Gamma; z : \forall\mathcal{V}[C_1].T_1 \text{ in } C_2) = \varnothing$, which implies $fpi(\text{let } \Gamma \text{ in } C_1) = \varnothing$. Thus, the induction hypothesis applies to (1) and yields $\text{let } \Gamma \text{ in } C_1, \Gamma \vdash t_1 : T_1$ (5). Now, let $\sigma$ stand for $\forall\bar{X}[\text{let } \Gamma \text{ in } C_1].T_1$ and $\Gamma'$ stand for $\Gamma; z : \sigma$. We have $fpi(\Gamma') = fpi(\text{let } \Gamma' \text{ in } C_2) = \varnothing$. Thus, the induction hypothesis applies to (2) and yields $\text{let } \Gamma' \text{ in } C_2, \Gamma' \vdash t_2 : T_2$ (6). Let us now weaken (5) and (6) so as to make them suitable premises for HMD-LetGen. Applying Lemma 1.4.2 to (5) yields $(\text{let } \Gamma' \text{ in } C_2) \wedge (\text{let } \Gamma \text{ in } C_1), \Gamma \vdash t_1 : T_1$ (7). Applying Lemma 1.4.2 to (6) yields $(\text{let } \Gamma' \text{ in } C_2) \wedge \exists\bar{X}.(\text{let } \Gamma \text{ in } C_1), \Gamma' \vdash t_2 : T_2$ (8). Last, (3) implies $\bar{X} \# ftv(\Gamma, \text{let } \Gamma' \text{ in } C_2)$ (9). Applying HMD-LetGen to (7), (9), and (8), we obtain $(\text{let } \Gamma' \text{ in } C_2) \wedge \exists\bar{X}.(\text{let } \Gamma \text{ in } C_1), \Gamma \vdash \text{let } z = t_1 \text{ in } t_2 : T_2$ (10). Now, by (4), (3), and C-LetDup, $\text{let } \Gamma' \text{ in } C_2$ is equivalent to $\text{let } \Gamma; z : \forall\bar{X}[C_1].T_1 \text{ in } C_2$. Using this fact, as well as (3), C-InEx, and C-InAnd, we find that the constraint $(\text{let } \Gamma' \text{ in } C_2) \wedge \exists\bar{X}.(\text{let } \Gamma \text{ in } C_1)$ is equivalent to $\text{let } \Gamma \text{ in } ((\text{let } z : \forall\bar{X}[C_1].T_1 \text{ in } C_2) \wedge \exists\bar{X}.C_1)$, which, by definition of the let form, is itself equivalent to $\text{let } \Gamma; z : \forall\bar{X}[C_1].T_1 \text{ in } C_2$. Last, by definition of $\bar{X}$, this constraint is $\text{let } \Gamma; z : \forall\mathcal{V}[C_1].T_1 \text{ in } C_2$. Thus, (10) is the goal.
• Case Sub. The rule's conclusion is $C \wedge T \leq T' \vdash t : T'$. Its premise is $C \vdash t : T$ (1). Applying the induction hypothesis to (1) yields $\text{let } \Gamma \text{ in } C, \Gamma \vdash t : T$ (2). By Lemma 1.4.2 and HMD-Sub, (2) implies $(\text{let } \Gamma \text{ in } C) \wedge T \leq T', \Gamma \vdash t : T'$, which by C-InAnd* may be written $\text{let } \Gamma \text{ in } (C \wedge T \leq T'), \Gamma \vdash t : T'$.
• Case Exists. The rule's conclusion is $\exists\bar{X}.C \vdash t : T$. Its premises are $C \vdash t : T$ (1) and $\bar{X} \# ftv(T)$ (2). We may further require, w.l.o.g., $\bar{X} \# ftv(\Gamma)$ (3). Applying the induction hypothesis to (1) yields $\text{let } \Gamma \text{ in } C, \Gamma \vdash t : T$ (4). Applying HMD-Exists to (2), (3), and (4), we find $\exists\bar{X}.\text{let } \Gamma \text{ in } C, \Gamma \vdash t : T$, which, by (3) and C-InEx, may be written $\text{let } \Gamma \text{ in } \exists\bar{X}.C, \Gamma \vdash t : T$.
As a corollary, we find that, for closed programs, the type systems $\mathrm{HM}(X)$ and $\mathrm{PCB}(X)$ coincide. In particular, a program is well-typed with respect to one if and only if it is well-typed with respect to the other. This supports the view that $\mathrm{PCB}(X)$ is an alternate formulation of $\mathrm{HM}(X)$.
1.5.5 Theorem: Assume $fpi(C) = \varnothing$ and $C \not\equiv \text{false}$. Then, $C, \varnothing \vdash t : T$ holds in $\mathrm{HM}(X)$ if and only if $C \vdash t : T$ holds in $\mathrm{PCB}(X)$.

1.6 Constraint generation

We now explain how to reduce type inference problems for $\mathrm{PCB}(X)$ to constraint solving problems. A type inference problem consists of an expression
$$\begin{aligned}
\llbracket x : T \rrbracket &= x \preceq T \\
\llbracket \lambda z.t : T \rrbracket &= \exists X_1 X_2.(\text{let } z : X_1 \text{ in } \llbracket t : X_2 \rrbracket \wedge X_1 \to X_2 \leq T) \\
\llbracket t_1\,t_2 : T \rrbracket &= \exists X_2.(\llbracket t_1 : X_2 \to T \rrbracket \wedge \llbracket t_2 : X_2 \rrbracket) \\
\llbracket \text{let } z = t_1 \text{ in } t_2 : T \rrbracket &= \text{let } z : \forall X[\llbracket t_1 : X \rrbracket].X \text{ in } \llbracket t_2 : T \rrbracket
\end{aligned}$$
Figure 1-10: Constraint generation
$t$ and a type $T$ of kind $\star$. The problem is to determine whether $t$ is well-typed with type $T$, that is, whether there exists a satisfiable constraint $C$ such that $C \vdash t : T$ holds. This formulation of the problem may seem to require an appropriate type $T$ to be known in advance; this is not really the case, since $T$ may be a type variable. A constraint solving problem consists of a constraint $C$. The problem is to determine whether $C$ is satisfiable. To reduce a type inference problem $(t, T)$ to a constraint solving problem, we must produce a constraint $C$ that is both sufficient and necessary for $C \vdash t : T$ to hold. Below, we explain how to compute such a constraint, which we write $\llbracket t : T \rrbracket$. We check that it is indeed sufficient by proving $\llbracket t : T \rrbracket \vdash t : T$. That is, the constraint $\llbracket t : T \rrbracket$ is specific enough to guarantee that $t$ has type $T$. We say that constraint generation is sound. We check that it is indeed necessary by proving that, for every constraint $C$, $C \vdash t : T$ implies $C \Vdash \llbracket t : T \rrbracket$. That is, every constraint that guarantees that $t$ has type $T$ is at least as specific as $\llbracket t : T \rrbracket$. We say that constraint generation is complete.
Together, these properties mean that $\llbracket t : T \rrbracket$ is the least specific constraint that guarantees that $t$ has type $T$.
We now see how to reduce a type inference problem to a constraint solving problem. Indeed, if there exists a satisfiable constraint $C$ such that $C \vdash t : T$ holds, then, by the completeness property, $C \Vdash \llbracket t : T \rrbracket$ holds, so $\llbracket t : T \rrbracket$ is satisfiable. Conversely, the soundness property states that $\llbracket t : T \rrbracket \vdash t : T$ holds, so, if $\llbracket t : T \rrbracket$ is satisfiable, then there exists a satisfiable constraint $C$ such that $C \vdash t : T$ holds. In other words, $t$ is well-typed with type $T$ if and only if $\llbracket t : T \rrbracket$ is satisfiable.
The existence of such a least specific constraint is the analogue of the existence of principal type schemes in classic presentations of ML-the-type-system (Damas and Milner, 1982). Indeed, a principal type scheme is least specific in the sense that all valid types are substitution instances of it. Here, the constraint $\llbracket t : T \rrbracket$ is least specific in the sense that all valid constraints entail it. Earlier, we established a connection between constraint entailment and refinement of type substitutions, in the specific case of equality constraints interpreted over a free algebra of finite types; see Lemma 1.3.39.
The constraint $\llbracket t : T \rrbracket$ is defined in Figure 1-10 by induction on the structure of the expression $t$. We refer to these defining equations as the constraint generation rules. The definition is quite terse. It is perhaps even simpler than the declarative specification of $\mathrm{PCB}(X)$ given in Figure 1-9; yet, we prove below that the two are equivalent.
Before explaining the definition, we state the requirements that bear on the type variables $X_1$, $X_2$, and $X$, which appear bound in the right-hand sides of the second, third, and fourth equations. These type variables must have kind $\star$. They must be chosen distinct (that is, $X_1 \neq X_2$ in the second equation) and fresh, in the following sense: type variables that appear bound in an equation's right-hand side must not appear free in the equation's left-hand side. Provided this restriction is obeyed, different choices of $X_1$, $X_2$, and $X$ lead to $\alpha$-equivalent constraints, that is, to the same constraint, since we identify objects up to $\alpha$-conversion; this guarantees that the above equations make sense. We remark that, since expressions do not have free type variables, the freshness requirement may be simplified to: type variables that appear bound in an equation's right-hand side must not appear free in $T$. However, this simplification is rendered invalid by the introduction of type annotations within expressions (page 102). Please note that we are able to state a formal freshness requirement. This is made possible by the fact that $\llbracket t : T \rrbracket$ has no free type variables other than those of $T$, which in turn depends on our explicit use of existential quantification.
Let us now review the four equations. The first one simply mirrors Var. The second one requires $t$ to have type $X_2$ under the hypothesis that $z$ has type $X_1$, and forms the arrow type $X_1 \to X_2$; this corresponds to Abs. Here, $X_1$ and $X_2$ must be fresh type variables, because we cannot in general guess the expected types of $z$ and $t$. The expected type $T$ is required to be a supertype of $X_1 \to X_2$; this corresponds to Sub. We must bind the fresh type variables $X_1$ and $X_2$, so as to guarantee that the generated constraint is unique up to $\alpha$-conversion. Furthermore, we must bind them existentially, because we intend the constraint solver to choose some appropriate value for them. This is justified by Exists. The third equation uses the fresh type variable $X_2$ to stand for the unknown type of $t_2$. The subexpression $t_1$ is expected to have type $X_2 \to T$. This corresponds to App. The fourth equation, which corresponds to Let, is the most interesting. It summons a fresh type variable $X$ and produces $\llbracket t_1 : X \rrbracket$. This constraint, whose sole free type variable is $X$, is the least specific constraint that must be imposed on $X$ so as to make it a valid type for $t_1$. As a result, the type scheme $\forall X[\llbracket t_1 : X \rrbracket].X$, abbreviated $\sigma$ in the following, is a principal type scheme for $t_1$. There remains to place $\llbracket t_2 : T \rrbracket$ inside the context $\text{let } z : \sigma \text{ in } \square$. Indeed, when placed inside this context, an instantiation constraint of the form $z \preceq T'$ acquires the meaning $\sigma \preceq T'$, which by definition of $\sigma$ and by Lemma 1.6.4 (see below) is equivalent to $\llbracket t_1 : T' \rrbracket$. Thus, the constraint produced by the fourth equation simulates a textual expansion of the let construct, whereby every occurrence of $z$ would be replaced with $t_1$. Thanks to type scheme introduction and instantiation constraints, however, this effect is achieved without duplication of source code or constraints. In other words, constraint generation has linear time and space complexity; duplication may take place during constraint solving only.
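The four generation rules translate directly into a recursive traversal of the term. The following sketch is illustrative only: the AST classes, the nested-tuple encoding of constraints, and the names `fresh` and `generate` are assumptions made for this example, not definitions from the chapter.

```python
# Illustrative sketch of the constraint generation rules of Figure 1-10.
# The AST classes and the tuple encoding of constraints are assumptions
# made for this example; they do not come from the chapter.
import itertools
from dataclasses import dataclass

@dataclass
class Var:
    name: str

@dataclass
class Abs:
    param: str
    body: object

@dataclass
class App:
    fn: object
    arg: object

@dataclass
class Let:
    name: str
    bound: object
    body: object

_counter = itertools.count()

def fresh():
    """Return a fresh type variable (X0, X1, ...)."""
    return f"X{next(_counter)}"

def generate(t, T):
    """Compute [[t : T]], encoded as a nested tuple."""
    if isinstance(t, Var):
        return ("inst", t.name, T)                       # x <= T (instantiation)
    if isinstance(t, Abs):
        x1, x2 = fresh(), fresh()                        # unknown domain/codomain
        return ("exists", (x1, x2),
                ("and", ("let-mono", t.param, x1, generate(t.body, x2)),
                        ("sub", ("arrow", x1, x2), T)))  # X1 -> X2 <= T
    if isinstance(t, App):
        x2 = fresh()                                     # unknown argument type
        return ("exists", (x2,),
                ("and", generate(t.fn, ("arrow", x2, T)),
                        generate(t.arg, x2)))
    if isinstance(t, Let):
        x = fresh()
        scheme = ("forall", x, generate(t.bound, x), x)  # forall X [[[t1 : X]]].X
        return ("let-poly", t.name, scheme, generate(t.body, T))
    raise TypeError(f"not an expression: {t!r}")
```

Note how the `Let` case visits $t_1$ exactly once: the constraint $\llbracket t_1 : X \rrbracket$ is stored inside the type scheme rather than copied at every occurrence of $z$, which is what gives generation its linear complexity.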
1.6.1 Exercise [$\star$, $\nrightarrow$]: Define the size of an expression, of a type, and of a constraint, viewed as abstract syntax trees. Check that the size of $\llbracket t : T \rrbracket$ is linear in the sum of the sizes of $t$ and $T$.
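The linear-size claim can also be checked empirically. The snippet below is a self-contained sketch under home-made assumptions: terms and constraints are nested tuples, and the size of a tree is its number of nodes. It verifies that each additional application node contributes a constant number of constraint nodes.

```python
# Empirical check of the linear-size property of constraint generation,
# on a tuple-based encoding (an assumption made for this example):
# ('var', x) | ('abs', z, t) | ('app', t1, t2) | ('let', z, t1, t2).
import itertools

_fresh = itertools.count()

def gen(t, T):
    """Constraint generation in the style of Figure 1-10, on tuples."""
    tag = t[0]
    if tag == "var":
        return ("inst", t[1], T)
    if tag == "abs":
        x1, x2 = f"X{next(_fresh)}", f"X{next(_fresh)}"
        return ("exists", x1, x2,
                ("and", ("letmono", t[1], x1, gen(t[2], x2)),
                        ("sub", ("arrow", x1, x2), T)))
    if tag == "app":
        x2 = f"X{next(_fresh)}"
        return ("exists", x2,
                ("and", gen(t[1], ("arrow", x2, T)), gen(t[2], x2)))
    if tag == "let":
        x = f"X{next(_fresh)}"
        return ("letpoly", t[1], ("forall", x, gen(t[2], x), x),
                gen(t[3], T))
    raise ValueError(tag)

def size(node):
    """Number of nodes in a nested-tuple tree; strings are leaves."""
    if not isinstance(node, tuple):
        return 1
    return 1 + sum(size(child) for child in node)
```

Generating constraints for a growing chain of applications and diffing the resulting sizes shows a constant increment per application, i.e., linear growth overall.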
We now establish several properties of constraint generation. We begin with soundness, whose proof is straightforward.
1.6.2 Theorem [Soundness]: $\llbracket t : T \rrbracket \vdash t : T$.
Proof: By induction on the structure of $t$.
• Case $x$. The goal $x \preceq T \vdash x : T$ follows from Var.
• Case $\lambda z.t$. By the induction hypothesis, we have $\llbracket t : X_2 \rrbracket \vdash t : X_2$. By Abs, this implies $\text{let } z : X_1 \text{ in } \llbracket t : X_2 \rrbracket \vdash \lambda z.t : X_1 \to X_2$. By Sub, this implies $\text{let } z : X_1 \text{ in } \llbracket t : X_2 \rrbracket \wedge X_1 \to X_2 \leq T \vdash \lambda z.t : T$. Lastly, because $X_1 X_2 \# ftv(T)$ holds, Exists applies and yields $\llbracket \lambda z.t : T \rrbracket \vdash \lambda z.t : T$.
• Case $t_1\,t_2$. By the induction hypothesis, we have $\llbracket t_1 : X_2 \to T \rrbracket \vdash t_1 : X_2 \to T$ and $\llbracket t_2 : X_2 \rrbracket \vdash t_2 : X_2$. By App, this implies $\llbracket t_1 : X_2 \to T \rrbracket \wedge \llbracket t_2 : X_2 \rrbracket \vdash t_1\,t_2 : T$. Because $X_2 \notin ftv(T)$ holds, Exists applies and yields $\llbracket t_1\,t_2 : T \rrbracket \vdash t_1\,t_2 : T$.
• Case $\text{let } z = t_1 \text{ in } t_2$. By the induction hypothesis, we have $\llbracket t_1 : X \rrbracket \vdash t_1 : X$ and $\llbracket t_2 : T \rrbracket \vdash t_2 : T$. By Let, these imply $\text{let } z : \forall\mathcal{V}[\llbracket t_1 : X \rrbracket].X \text{ in } \llbracket t_2 : T \rrbracket \vdash \text{let } z = t_1 \text{ in } t_2 : T$. Because $ftv(\llbracket t_1 : X \rrbracket)$ is $X$, the universal quantification on $\mathcal{V}$ really bears on $X$ alone. We have proved $\llbracket \text{let } z = t_1 \text{ in } t_2 : T \rrbracket \vdash \text{let } z = t_1 \text{ in } t_2 : T$.
The following lemmas are used in the proof of the completeness property, as well as on several other occasions. The first two state that $\llbracket t : T \rrbracket$ is covariant with respect to $T$. Roughly speaking, this means that enough subtyping constraints are generated to achieve completeness with respect to Sub.
1.6.3 Lemma: $\llbracket t : T \rrbracket \wedge T \leq T'$ entails $\llbracket t : T' \rrbracket$.
1.6.4 Lemma: $X \notin ftv(T)$ implies $\exists X.(\llbracket t : X \rrbracket \wedge X \leq T) \equiv \llbracket t : T \rrbracket$.
The next lemma gives a simplified version of the second constraint generation rule, in the specific case where the expected type is an arrow type. Then, fresh type variables need not be generated; one may directly use the arrow's domain and codomain instead.
1.6.5 Lemma: $\llbracket \lambda z.t : T_1 \to T_2 \rrbracket$ is equivalent to $\text{let } z : T_1 \text{ in } \llbracket t : T_2 \rrbracket$.
We conclude with the completeness property.
1.6.6 Theorem [Completeness]: If $C \vdash t : T$, then $C \Vdash \llbracket t : T \rrbracket$.
Proof: By induction on the derivation of $C \vdash t : T$.
• Case Var. The rule's conclusion is $C \vdash x : T$. Its premise is $C \Vdash x \preceq T$, which is also the goal.
• Case Abs. The rule's conclusion is $\text{let } z : T \text{ in } C \vdash \lambda z.t : T \to T'$. Its premise is $C \vdash t : T'$. By the induction hypothesis, we have $C \Vdash \llbracket t : T' \rrbracket$. By congruence of entailment, this implies $\text{let } z : T \text{ in } C \Vdash \text{let } z : T \text{ in } \llbracket t : T' \rrbracket$, which, by Lemma 1.6.5, may be written $\text{let } z : T \text{ in } C \Vdash \llbracket \lambda z.t : T \to T' \rrbracket$.
• Case App. The rule's conclusion is $C_1 \wedge C_2 \vdash t_1\,t_2 : T'$. Its premises are $C_1 \vdash t_1 : T \to T'$ and $C_2 \vdash t_2 : T$. By the induction hypothesis, we have $C_1 \Vdash \llbracket t_1 : T \to T' \rrbracket$ and $C_2 \Vdash \llbracket t_2 : T \rrbracket$. Thus, $C_1 \wedge C_2$ entails $\llbracket t_1 : T \to T' \rrbracket \wedge \llbracket t_2 : T \rrbracket$, which, by C-NameEq, may be written $\exists X_2.(X_2 = T \wedge \llbracket t_1 : X_2 \to T' \rrbracket \wedge \llbracket t_2 : X_2 \rrbracket)$, where $X_2 \notin ftv(T, T')$.
Forgetting about the equation $X_2 = T$, we find that $C_1 \wedge C_2$ entails $\exists X_2.(\llbracket t_1 : X_2 \to T' \rrbracket \wedge \llbracket t_2 : X_2 \rrbracket)$, which is precisely $\llbracket t_1\,t_2 : T' \rrbracket$.
• Case Let. The rule's conclusion is $\text{let } z : \forall\mathcal{V}[C_1].T_1 \text{ in } C_2 \vdash \text{let } z = t_1 \text{ in } t_2 : T_2$. Its premises are $C_1 \vdash t_1 : T_1$ and $C_2 \vdash t_2 : T_2$. By the induction hypothesis, we have $C_1 \Vdash \llbracket t_1 : T_1 \rrbracket$ and $C_2 \Vdash \llbracket t_2 : T_2 \rrbracket$, which implies $\text{let } z : \forall\mathcal{V}[C_1].T_1 \text{ in } C_2 \Vdash \text{let } z : \forall\mathcal{V}[\llbracket t_1 : T_1 \rrbracket].T_1 \text{ in } \llbracket t_2 : T_2 \rrbracket$ (1).
Now, let us establish $\text{true} \Vdash \forall X[\llbracket t_1 : X \rrbracket].X \preceq \forall\mathcal{V}[\llbracket t_1 : T_1 \rrbracket].T_1$ (2). By definition, this requires proving $\exists\bar{X}_1.(\llbracket t_1 : T_1 \rrbracket \wedge T_1 \leq Z) \Vdash \exists X.(\llbracket t_1 : X \rrbracket \wedge X \leq Z)$ (3), where $\bar{X}_1 = ftv(T_1)$ and $Z \notin X\bar{X}_1$ (4). By Lemma 1.6.3, (4), and C-Ex*, the left-hand side of (3) entails $\llbracket t_1 : Z \rrbracket$. By (4) and Lemma 1.6.4, the right-hand side of (3) is $\llbracket t_1 : Z \rrbracket$. Thus, (3) holds, and so does (2).
By ( 2 ) ( 2 ) (2)(2)(2) and Lemma 1.3.22, we have let z : V [ [ [ t 1 : T 1 ] ] ] . T 1 z : V [ [ t 1 : T 1 ] ] . T 1 z:AAV[([[)t_(1):T_(1)(]])].T_(1)\mathrm{z}: \forall \mathcal{V}\left[\llbracket \mathrm{t}_{1}: \mathrm{T}_{1} \rrbracket\right] . \mathrm{T}_{1}z:V[[[t1:T1]]].T1 in [ [ t 2 : T 2 ] ] [ [ t 2 : T 2 ] ] [[t_(2):T_(2)]]⊩\llbracket \mathrm{t}_{2}: \mathrm{T}_{2} \rrbracket \Vdash[[t2:T2]] let z : X [ [ [ t 1 : X ] ] ] X z : X [ [ t 1 : X ] ] X z:AAX[([[)t_(1):X(]])]*X\mathrm{z}: \forall \mathrm{X}\left[\llbracket \mathrm{t}_{1}: \mathrm{X} \rrbracket\right] \cdot \mathrm{X}z:X[[[t1:X]]]X in [ [ t 2 : T 2 ] ] [ [ t 2 : T 2 ] ] [[t_(2):T_(2)]]\llbracket \mathrm{t}_{2}: \mathrm{T}_{2} \rrbracket[[t2:T2]] (5). By transitivity of entailment, (1) and (5) yield let z : V [ C 1 ] . T 1 z : V C 1 . T 1 z:AAV[C_(1)].T_(1)\mathrm{z}: \forall \mathcal{V}\left[C_{1}\right] . \mathrm{T}_{1}z:V[C1].T1 in C 2 [ [ C 2 [ [ C_(2)⊩[[C_{2} \Vdash \llbracketC2[[ let z = t 1 z = t 1 z=t_(1)\mathrm{z}=\mathrm{t}_{1}z=t1 in t 2 : T 2 ] ] t 2 : T 2 ] ] t_(2):T_(2)]]\mathrm{t}_{2}: \mathrm{T}_{2} \rrbrackett2:T2]].
  • Case Sub. The rule's conclusion is $C \wedge \mathrm{T} \leq \mathrm{T}' \vdash \mathrm{t} : \mathrm{T}'$. Its premise is $C \vdash \mathrm{t} : \mathrm{T}$. By the induction hypothesis, we have $C \Vdash \llbracket \mathrm{t} : \mathrm{T} \rrbracket$, which implies $C \wedge \mathrm{T} \leq \mathrm{T}' \Vdash \llbracket \mathrm{t} : \mathrm{T} \rrbracket \wedge \mathrm{T} \leq \mathrm{T}'$. By Lemma 1.6.3 and by transitivity of entailment, we obtain $C \wedge \mathrm{T} \leq \mathrm{T}' \Vdash \llbracket \mathrm{t} : \mathrm{T}' \rrbracket$.
  • Case Exists. The rule's conclusion is $\exists\overline{\mathrm{X}}.C \vdash \mathrm{t} : \mathrm{T}$. Its premises are $C \vdash \mathrm{t} : \mathrm{T}$ and $\overline{\mathrm{X}} \mathrel{\#} ftv(\mathrm{T})$ (1). By the induction hypothesis, we have $C \Vdash \llbracket \mathrm{t} : \mathrm{T} \rrbracket$.
By congruence of entailment, this implies $\exists\overline{\mathrm{X}}.C \Vdash \exists\overline{\mathrm{X}}.\llbracket \mathrm{t} : \mathrm{T} \rrbracket$ (2). Furthermore, (1) implies $\overline{\mathrm{X}} \mathrel{\#} ftv(\llbracket \mathrm{t} : \mathrm{T} \rrbracket)$ (3). By (3) and C-Ex*, (2) may be written $\exists\overline{\mathrm{X}}.C \Vdash \llbracket \mathrm{t} : \mathrm{T} \rrbracket$.

1.7 Type soundness

We are now ready to establish type soundness for our type system. The statement that we wish to prove is sometimes known as Milner's slogan: well-typed programs do not go wrong (Milner, 1978). Below, we define well-typedness in terms of our constraint generation rules, for the sake of convenience, and establish type soundness with respect to that particular definition. Theorems 1.4.7, 1.5.4, and 1.6.6 imply that type soundness also holds when well-typedness is defined with respect to the typing judgements of DM, HM($X$), or PCB($X$). We establish type soundness by following Wright and Felleisen's so-called syntactic approach (1994b). The approach consists in isolating two independent properties. Subject reduction, whose exact statement will be given below, implies that well-typedness is preserved by reduction. Progress states that no stuck configuration is well-typed. It is immediate to check that, if both properties hold, then no well-typed program can reduce to a stuck configuration. Subject reduction itself depends on a key lemma, usually known as a (term) substitution lemma. We immediately give two versions of this lemma: the former is stated in terms of PCB($X$) judgements, while the latter is stated in terms of the constraint generation rules.
1.7.1 Lemma [Substitution]: $C \vdash \mathrm{t} : \mathrm{T}$ and $C_0 \vdash \mathrm{t}_0 : \mathrm{T}_0$ imply let $\mathrm{z}_0 : \forall\overline{\mathrm{X}}_0[C_0].\mathrm{T}_0$ in $C \vdash [\mathrm{z}_0 \mapsto \mathrm{t}_0]\mathrm{t} : \mathrm{T}$.
Proof: The proof is by structural induction on the derivation of $C \vdash \mathrm{t} : \mathrm{T}$. In each proof case, we adopt the notations of Figure 1-9. We write $\sigma_0$ for $\forall\overline{\mathrm{X}}_0[C_0].\mathrm{T}_0$. We refer to the hypothesis $C_0 \vdash \mathrm{t}_0 : \mathrm{T}_0$ as (1). We assume, w.l.o.g., $\overline{\mathrm{X}}_0 \mathrel{\#} ftv(C, \mathrm{T})$ (2) and $\mathrm{z}_0 \notin fpi(\sigma_0)$ (3).
  • Case Var. The rule's conclusion is $C \vdash \mathrm{x} : \mathrm{T}$ (4). Its premise is $C \Vdash \mathrm{x} \preceq \mathrm{T}$ (5). Two subcases arise.
Subcase $\mathrm{x}$ is $\mathrm{z}_0$. Applying Sub to (1) yields $C_0 \wedge \mathrm{T}_0 \leq \mathrm{T} \vdash \mathrm{t}_0 : \mathrm{T}$. By (2) and Exists, this implies $\exists\overline{\mathrm{X}}_0.(C_0 \wedge \mathrm{T}_0 \leq \mathrm{T}) \vdash \mathrm{t}_0 : \mathrm{T}$ (6). Furthermore, by (2) again, the constraint $\exists\overline{\mathrm{X}}_0.(C_0 \wedge \mathrm{T}_0 \leq \mathrm{T})$ is $\sigma_0 \preceq \mathrm{T}$, which is equivalent to let $\mathrm{z}_0 : \sigma_0$ in $\mathrm{z}_0 \preceq \mathrm{T}$. As a result, (6) may be written let $\mathrm{z}_0 : \sigma_0$ in $\mathrm{x} \preceq \mathrm{T} \vdash [\mathrm{z}_0 \mapsto \mathrm{t}_0]\mathrm{x} : \mathrm{T}$ (7).
Subcase $\mathrm{x}$ is not $\mathrm{z}_0$. Then, $[\mathrm{z}_0 \mapsto \mathrm{t}_0]\mathrm{x}$ is $\mathrm{x}$. Thus, Var yields $\exists\sigma_0 \wedge \mathrm{x} \preceq \mathrm{T} \vdash [\mathrm{z}_0 \mapsto \mathrm{t}_0]\mathrm{x} : \mathrm{T}$. By C-In*, this may be read let $\mathrm{z}_0 : \sigma_0$ in $\mathrm{x} \preceq \mathrm{T} \vdash [\mathrm{z}_0 \mapsto \mathrm{t}_0]\mathrm{x} : \mathrm{T}$, that is, again (7).
In either subcase, by (5), by congruence of entailment, and by Lemma 1.5.2, (7) implies let $\mathrm{z}_0 : \sigma_0$ in $C \vdash [\mathrm{z}_0 \mapsto \mathrm{t}_0]\mathrm{t} : \mathrm{T}$.
  • Case Abs. The rule's conclusion is let $\mathrm{z} : \mathrm{T}$ in $C \vdash \lambda\mathrm{z}.\mathrm{t} : \mathrm{T} \rightarrow \mathrm{T}'$. Its premise is $C \vdash \mathrm{t} : \mathrm{T}'$ (8). We may assume, w.l.o.g., that $\mathrm{z}$ is distinct from $\mathrm{z}_0$ and does not occur free within $\mathrm{t}_0$ or $\sigma_0$ (9). Applying the induction hypothesis to (8) yields let $\mathrm{z}_0 : \sigma_0$ in $C \vdash [\mathrm{z}_0 \mapsto \mathrm{t}_0]\mathrm{t} : \mathrm{T}'$, which, by Abs, implies let $\mathrm{z} : \mathrm{T}$ in (let $\mathrm{z}_0 : \sigma_0$ in $C$) $\vdash \lambda\mathrm{z}.[\mathrm{z}_0 \mapsto \mathrm{t}_0]\mathrm{t} : \mathrm{T} \rightarrow \mathrm{T}'$. By (9) and C-LetLet, this may be written let $\mathrm{z}_0 : \sigma_0$ in (let $\mathrm{z} : \mathrm{T}$ in $C$) $\vdash [\mathrm{z}_0 \mapsto \mathrm{t}_0](\lambda\mathrm{z}.\mathrm{t}) : \mathrm{T} \rightarrow \mathrm{T}'$.
  • Case App. By the induction hypothesis, by App, and by C-InAnd.
  • Case Let. The rule's conclusion is let $\mathrm{z} : \forall\overline{\mathrm{X}}_1[C_1].\mathrm{T}_1$ in $C_2 \vdash$ let $\mathrm{z} = \mathrm{t}_1$ in $\mathrm{t}_2 : \mathrm{T}_2$, where $\overline{\mathrm{X}}_1 = ftv(C_1, \mathrm{T}_1)$. Its premises are $C_1 \vdash \mathrm{t}_1 : \mathrm{T}_1$ (10) and $C_2 \vdash \mathrm{t}_2 : \mathrm{T}_2$ (11). We may assume, w.l.o.g., that $\mathrm{z}$ is distinct from $\mathrm{z}_0$ and does not occur free within $\mathrm{t}_0$ or $\sigma_0$ (12). We may also assume, w.l.o.g., $\overline{\mathrm{X}}_1 \mathrel{\#} ftv(\sigma_0)$ (13). Applying the induction hypothesis to (10) and (11) respectively yields let $\mathrm{z}_0 : \sigma_0$ in $C_1 \vdash [\mathrm{z}_0 \mapsto \mathrm{t}_0]\mathrm{t}_1 : \mathrm{T}_1$ (14) and let $\mathrm{z}_0 : \sigma_0$ in $C_2 \vdash [\mathrm{z}_0 \mapsto \mathrm{t}_0]\mathrm{t}_2 : \mathrm{T}_2$ (15).
Applying Let to (14) and (15) produces let $\mathrm{z} : \forall\mathcal{V}[\text{let } \mathrm{z}_0 : \sigma_0 \text{ in } C_1].\mathrm{T}_1$ in let $\mathrm{z}_0 : \sigma_0$ in $C_2 \vdash [\mathrm{z}_0 \mapsto \mathrm{t}_0](\text{let } \mathrm{z} = \mathrm{t}_1 \text{ in } \mathrm{t}_2) : \mathrm{T}_2$ (16). Now, we have
$$\begin{array}{lll}
& \text{let } \mathrm{z}_0 : \sigma_0 ; \mathrm{z} : \forall\overline{\mathrm{X}}_1[C_1].\mathrm{T}_1 \text{ in } C_2 & \\
\equiv & \text{let } \mathrm{z}_0 : \sigma_0 ; \mathrm{z} : \forall\overline{\mathrm{X}}_1[\text{let } \mathrm{z}_0 : \sigma_0 \text{ in } C_1].\mathrm{T}_1 \text{ in } C_2 & (17) \\
\equiv & \text{let } \mathrm{z} : \forall\overline{\mathrm{X}}_1[\text{let } \mathrm{z}_0 : \sigma_0 \text{ in } C_1].\mathrm{T}_1 ; \mathrm{z}_0 : \sigma_0 \text{ in } C_2 & (18) \\
\Vdash & \text{let } \mathrm{z} : \forall\mathcal{V}[\text{let } \mathrm{z}_0 : \sigma_0 \text{ in } C_1].\mathrm{T}_1 ; \mathrm{z}_0 : \sigma_0 \text{ in } C_2 & (19)
\end{array}$$
where (17) follows from (13), (3), and C-LetDup; (18) follows from (12) and C-LetLet; and (19) is by Lemma 1.3.25. Thus, applying Lemma 1.5.2 to (16) yields let $\mathrm{z}_0 : \sigma_0 ; \mathrm{z} : \forall\overline{\mathrm{X}}_1[C_1].\mathrm{T}_1$ in $C_2 \vdash [\mathrm{z}_0 \mapsto \mathrm{t}_0](\text{let } \mathrm{z} = \mathrm{t}_1 \text{ in } \mathrm{t}_2) : \mathrm{T}_2$.
  • Case Sub. By the induction hypothesis, by Sub, and by C-InAnd*.
  • Case Exists. The rule's conclusion is $\exists\overline{\mathrm{X}}.C \vdash \mathrm{t} : \mathrm{T}$. Its premises are $C \vdash \mathrm{t} : \mathrm{T}$ (20) and $\overline{\mathrm{X}} \mathrel{\#} ftv(\mathrm{T})$ (21). We may assume, w.l.o.g., $\overline{\mathrm{X}} \mathrel{\#} ftv(\sigma_0)$ (22). Applying the induction hypothesis to (20) yields let $\mathrm{z}_0 : \sigma_0$ in $C \vdash [\mathrm{z}_0 \mapsto \mathrm{t}_0]\mathrm{t} : \mathrm{T}$, which, by (21) and Exists, implies $\exists\overline{\mathrm{X}}.$ let $\mathrm{z}_0 : \sigma_0$ in $C \vdash [\mathrm{z}_0 \mapsto \mathrm{t}_0]\mathrm{t} : \mathrm{T}$ (23). By (22) and C-InEx, (23) is let $\mathrm{z}_0 : \sigma_0$ in $\exists\overline{\mathrm{X}}.C \vdash [\mathrm{z}_0 \mapsto \mathrm{t}_0]\mathrm{t} : \mathrm{T}$.
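For concreteness, the substitution operation $[\mathrm{z}_0 \mapsto \mathrm{t}_0]\mathrm{t}$ that the lemma manipulates can be sketched in OCaml. The term representation and the fresh-name supply below are ours, not the chapter's; they are only meant to make the shadowing and renaming cases explicit:

```ocaml
(* A sketch of capture-avoiding substitution [z0 |-> t0] t on
   lambda-terms. Names and representation are illustrative only. *)
type term =
  | Var of string
  | Abs of string * term
  | App of term * term

let counter = ref 0
let fresh z = incr counter; z ^ string_of_int !counter

let rec subst z0 t0 = function
  | Var z -> if z = z0 then t0 else Var z
  | App (t1, t2) -> App (subst z0 t0 t1, subst z0 t0 t2)
  | Abs (z, t) when z = z0 -> Abs (z, t)  (* z0 is shadowed: stop *)
  | Abs (z, t) ->
      (* rename the bound variable so that free variables of t0
         cannot be captured *)
      let z' = fresh z in
      Abs (z', subst z0 t0 (subst z (Var z') t))
```

For instance, `subst "x" (Var "y") (Abs ("x", Var "x"))` leaves the term unchanged, since the binder shadows `x`; this is exactly the side condition "z is distinct from z₀" discharged w.l.o.g. in the proof cases above.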
1.7.2 Lemma: let $\mathrm{z} : \forall\overline{\mathrm{X}}[\llbracket \mathrm{t}_2 : \mathrm{T}_2 \rrbracket].\mathrm{T}_2$ in $\llbracket \mathrm{t}_1 : \mathrm{T}_1 \rrbracket$ entails $\llbracket [\mathrm{z} \mapsto \mathrm{t}_2]\mathrm{t}_1 : \mathrm{T}_1 \rrbracket$.
Before going on, let us give a few definitions and formulate several requirements. First, we must define an initial environment $\Gamma_0$, which assigns a type scheme to every constant. A couple of requirements must be made to ensure that $\Gamma_0$ is consistent with the semantics of constants, as specified by $\xrightarrow{\delta}$. Second, we must extend constraint generation and well-typedness to configurations, as opposed to programs, since reduction operates on configurations.
Last, we must formulate a restriction to tame the interaction between side effects and let-polymorphism, which is unsound if unrestricted.
1.7.3 Definition: Let $\Gamma_0$ be an environment whose domain is the set of constants $\mathcal{Q}$. We require $ftv(\Gamma_0) = \varnothing$, $fpi(\Gamma_0) = \varnothing$, and $\exists\Gamma_0 \equiv$ true. We refer to $\Gamma_0$ as the initial typing environment.
1.7.4 Definition: Let ref be an isolated, invariant type constructor of signature $\star \Rightarrow \star$. A store type $M$ is a finite mapping from memory locations to types. We write ref $M$ for the environment that maps $m$ to ref $M(m)$ when $m$ is in the domain of $M$. Assuming $dom(\mu)$ and $dom(M)$ coincide, the constraint $\llbracket \mu : M \rrbracket$ is defined as the conjunction of the constraints $\llbracket \mu(m) : M(m) \rrbracket$, where $m$ ranges over $dom(\mu)$. Under the same assumption, the constraint $\llbracket \mathrm{t}/\mu : \mathrm{T}/M \rrbracket$ is defined as $\llbracket \mathrm{t} : \mathrm{T} \rrbracket \wedge \llbracket \mu : M \rrbracket$. A configuration $\mathrm{t}/\mu$ is well-typed if and only if there exist a type $\mathrm{T}$ and a store type $M$ such that $dom(\mu) = dom(M)$ and the constraint let $\Gamma_0$; ref $M$ in $\llbracket \mathrm{t}/\mu : \mathrm{T}/M \rrbracket$ is satisfiable.
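The construction of $\llbracket \mu : M \rrbracket$ can be rendered as a small program. The following OCaml sketch is purely illustrative (the toy constraint syntax, the representation of stores as association lists, and all names are our own assumptions); it shows the two ingredients of the definition: the domain agreement check and the conjunction over $dom(\mu)$:

```ocaml
(* A toy rendering of Definition 1.7.4. Stores map locations to value
   names; store types map locations to types. Illustrative only. *)
type ty = Int | Bool | Ref of ty
type constr =
  | True
  | And of constr * constr
  | HasType of string * ty          (* stands for [[ v : T ]] *)

(* [[ mu : M ]]: the conjunction of [[ mu(m) : M(m) ]] over dom(mu),
   defined only when dom(mu) = dom(M); otherwise return None. *)
let store_constraint (mu : (string * string) list)
                     (m_ty : (string * ty) list) : constr option =
  let dom l = List.sort compare (List.map fst l) in
  if dom mu <> dom m_ty then None
  else
    Some
      (List.fold_left
         (fun acc (m, v) -> And (acc, HasType (v, List.assoc m m_ty)))
         True mu)
```

For instance, a one-cell store `[("m", "v")]` against the store type `[("m", Int)]` yields the single conjunct asserting that `v` has type `Int`, while mismatched domains yield no constraint at all, mirroring the side condition $dom(\mu) = dom(M)$.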
The type ref $\mathrm{T}$ is the type of references (that is, memory locations) that store data of type $\mathrm{T}$. It must be invariant in its parameter, reflecting the fact that references may be read and written.
A store is a complex object: it may contain values that indirectly refer to each other via memory locations. In fact, it is a representation of the graph formed by objects and pointers in memory, which may contain cycles. We rely on store types to deal with such cycles. In the definition of well-typedness, the store type $M$ imposes a constraint on the contents of the store (the value $\mu(m)$ must have type $M(m)$) but also plays the role of a hypothesis: by placing the constraint $\llbracket \mathrm{t}/\mu : \mathrm{T}/M \rrbracket$ within the context let ref $M$ in $[\,]$, we give meaning to free occurrences of memory locations within $\llbracket \mathrm{t}/\mu : \mathrm{T}/M \rrbracket$, and stipulate that it is valid to assume that $m$ has type $M(m)$. In other words, we essentially view the store as a large, mutually recursive binding of locations to values. Since no satisfiable constraint may have a free program identifier (Lemma 1.3.31), every well-typed configuration must be closed. The context let $\Gamma_0$ in $[\,]$ gives meaning to occurrences of constants within $\llbracket \mathrm{t}/\mu : \mathrm{T}/M \rrbracket$.
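The view of the store as a mutually recursive binding can be made concrete: with a single reference cell, one can tie a cycle in which the value stored at a location mentions that very location. A classic instance is Landin's knot, sketched here in OCaml (the names are ours); recursion is obtained by backpatching the cell:

```ocaml
(* Landin's knot: the cell r ends up containing a closure whose body
   dereferences r, so the store maps r's location to a value that
   refers back to that location. *)
let fact : int -> int =
  let r = ref (fun (_ : int) -> 0) in
  r := (fun n -> if n = 0 then 1 else n * !r (n - 1));
  !r

let () = assert (fact 5 = 120)
```

Typechecking such a store snapshot requires exactly the hypothesis described above: the contents of the cell can only be given type int $\rightarrow$ int under the assumption that its own location already has type ref (int $\rightarrow$ int).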
We now define a relation between configurations that plays a key role in the statement of the subject reduction property. The point of subject reduction is to guarantee that well-typedness is preserved by reduction. However, such a simple statement is too weak to be amenable to inductive proof. Thus, for the purposes of the proof, we must be more specific. To begin, let us consider the simpler case of a pure semantics, that is, a semantics without stores. Then, we must state that if an expression $\mathrm{t}$ has type $\mathrm{T}$ under a certain constraint, then its reduct $\mathrm{t}'$ has type $\mathrm{T}$ under the same constraint. In terms of generated constraints, this statement becomes: let $\Gamma_0$ in $\llbracket \mathrm{t} : \mathrm{T} \rrbracket$ entails let $\Gamma_0$ in $\llbracket \mathrm{t}' : \mathrm{T} \rrbracket$.
Let us now return to the general case, where a store is present. Then, the statement of well-typedness for a configuration $\mathrm{t}/\mu$ involves a store type $M$ whose domain is that of $\mu$. So, the statement of well-typedness for its reduct $\mathrm{t}'/\mu'$ must involve a store type $M'$ whose domain is that of $\mu'$, which is larger if allocation occurred. The types of existing memory locations must not change: we must request that $M$ and $M'$ agree on $dom(M)$, that is, $M'$ must extend $M$. Furthermore, the types assigned to new memory locations in $dom(M') \setminus dom(M)$ might involve new type variables, that is, variables that do not appear free in $M$ or $\mathrm{T}$. We must allow these variables to be hidden (that is, existentially quantified), otherwise the entailment assertion cannot hold. These considerations lead us to the following definition:
1.7.5 Definition: $\mathrm{t}/\mu \sqsubseteq \mathrm{t}'/\mu'$ holds if and only if, for every type $\mathrm{T}$ and for every store type $M$ such that $dom(\mu) = dom(M)$, there exist a set of type variables $\overline{\mathrm{Y}}$ and a store type $M'$ such that $\overline{\mathrm{Y}} \mathrel{\#} ftv(\mathrm{T}, M)$ and $ftv(M') \subseteq \overline{\mathrm{Y}} \cup ftv(M)$ and $dom(M') = dom(\mu')$ and $M'$ extends $M$ and
$$\text{let } \Gamma_0 ; \text{ref } M \text{ in } \llbracket \mathrm{t}/\mu : \mathrm{T}/M \rrbracket \;\Vdash\; \exists\overline{\mathrm{Y}}.\,\text{let } \Gamma_0 ; \text{ref } M' \text{ in } \llbracket \mathrm{t}'/\mu' : \mathrm{T}/M' \rrbracket$$
The relation $\sqsubseteq$ is intended to express a connection between a configuration and its reduct. Thus, subject reduction may be stated as $(\longrightarrow) \subseteq (\sqsubseteq)$: that is, $\sqsubseteq$ is indeed a conservative description of reduction.
We have introduced an initial environment $\Gamma_0$ and used it in the definition of well-typedness, but we have not yet ensured that the type schemes assigned to constants are an adequate description of their semantics. We now formulate two requirements that relate $\Gamma_0$ with $\xrightarrow{\delta}$. They are specializations of the subject reduction and progress properties to configurations that involve an application of a constant. They represent proof obligations that must be discharged when concrete definitions of $\mathcal{Q}$, $\xrightarrow{\delta}$, and $\Gamma_0$ are given.
1.7.6 Definition: We require (i) $(\xrightarrow{\delta}) \subseteq (\sqsubseteq)$; and (ii) if the configuration $\mathrm{c}\,\mathrm{v}_1 \ldots \mathrm{v}_k / \mu$ (where $k \geq 0$) is well-typed, then either it is reducible, or $\mathrm{c}\,\mathrm{v}_1 \ldots \mathrm{v}_k$ is a value.
The last point that remains to be settled before proving type soundness is the interaction between side effects and let-polymorphism. The following example illustrates the problem:
$$\text{let } r = \text{ref } \lambda z.z \text{ in let } \_ = (r := \lambda z.(z \mathbin{\hat{+}} \hat{1})) \text{ in } {!}r\ \text{true}$$
This expression reduces to true $\mathbin{\hat{+}}$ $\hat{1}$, so it must not be well-typed. Yet, if natural type schemes are assigned to ref, !, and := (see Example 1.9.5), then
it is well-typed with respect to the rules given so far, because $r$ receives the polymorphic type scheme $\forall\mathrm{X}.\operatorname{ref}(\mathrm{X} \rightarrow \mathrm{X})$, which allows writing a function of type int $\rightarrow$ int into $r$ and reading it back with type bool $\rightarrow$ bool. The problem is that let-polymorphism simulates a textual duplication of the let-bound expression ref $\lambda z.z$, while the semantics first reduces it to a value $m$, causing a new binding $m \mapsto \lambda z.z$ to appear in the store, then duplicates the address $m$. The new store binding is not duplicated: both copies of $m$ refer to the same memory cell. For this reason, generalization is unsound in this case, and must be restricted. Many authors have attempted to come up with a sound type system that accepts all pure programs and remains flexible enough in the presence of side effects (Tofte, 1988; Leroy, 1992). These proposals are often complex, which is why they have been abandoned in favor of an extremely simple syntactic restriction, known as the value restriction (Wright, 1995).
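This failure scenario is exactly what the value restriction (introduced below) rules out. OCaml, which implements a form of it, refuses to generalize the let-bound reference; the sketch below is indicative of its behavior (the reported type names are approximate):

```ocaml
(* ref (fun z -> z) is an application, not a value, so r is NOT
   generalized: OCaml gives it a weak, monomorphic type, roughly
   ('_weak1 -> '_weak1) ref. *)
let r = ref (fun z -> z)
let () = r := (fun z -> z + 1)  (* this write pins the type to int -> int *)
(* A subsequent [!r true] is now rejected by the typechecker, since
   bool is not int -- which is exactly what soundness demands. *)
let () = assert (!r 1 = 2)
```

The monomorphic cell can still be used consistently at one type, as the final assertion shows; only the unsound read at a second type is forbidden.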
1.7.7 Definition: A program satisfies the value restriction if and only if all subexpressions of the form let $\mathrm{z} = \mathrm{t}_1$ in $\mathrm{t}_2$ are in fact of the form let $\mathrm{z} = \mathrm{v}_1$ in $\mathrm{t}_2$. In the following, we assume that either all constants have pure semantics, or all programs satisfy the value restriction.
Put slightly differently, the value restriction states that only values may be generalized. This eliminates the problem altogether, since duplicating values does not affect a program's semantics. Note that any program that does not satisfy the value restriction can be turned into one that does and has the same semantics: it suffices to change let $\mathrm{z} = \mathrm{t}_1$ in $\mathrm{t}_2$ into $(\lambda\mathrm{z}.\mathrm{t}_2)\,\mathrm{t}_1$ when $\mathrm{t}_1$ is not a value. Of course, such a transformation may cause the program to become ill-typed. In other words, the value restriction causes some perfectly safe programs to be rejected. In particular, as stated above, it prevents generalizing applications of the form $\mathrm{c}\,\mathrm{v}_1 \ldots \mathrm{v}_k$, where $\mathrm{c}$ is a destructor of arity $k$. This is excessive, because many destructors have pure semantics; only a few, such as ref, allocate new mutable storage. Furthermore, we use pure destructors to encode numerous language features (Section 1.9). Fortunately, it is easy to relax the restriction to allow generalizing not only values, but also a more general class of nonexpansive expressions, whose syntax guarantees that such expressions cannot allocate new mutable storage (that is, expand the domain of the store). The term nonexpansive was coined by Tofte (1988). Nonexpansive expressions may include applications of the form $\mathrm{c}\,\mathrm{t}_1 \ldots \mathrm{t}_k$, where $\mathrm{c}$ is a pure destructor of arity $k$ and $\mathrm{t}_1, \ldots, \mathrm{t}_k$ are nonexpansive. Experience shows that this slightly relaxed restriction is acceptable in practice. Some other improvements to the value restriction exist; see, for example, Garrigue (2002).
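OCaml behaves along these lines: a constructor applied to a value counts as a value and is generalized, while an application of a possibly expansive function (such as ref) is not. A sketch, with the inferred types we expect shown in comments:

```ocaml
(* Generalized: a constructor applied to a value is itself a value,
   so v receives the polymorphic scheme ('a -> 'a) option. *)
let v = Some (fun z -> z)

(* Not generalized: ref may expand the store, so its result keeps a
   weak, monomorphic type, roughly ('_weak1 -> '_weak1) ref. *)
let w = ref (fun z -> z)

(* v is genuinely polymorphic: it can be used at two different types. *)
let () =
  assert ((match v with Some f -> f 1 | None -> 0) = 1);
  assert (match v with Some f -> f true | None -> false)
```

The two assertions typecheck only because `v` is generalized; replacing `Some` by `ref` in its definition would make the second use ill-typed.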
Another frequent limitation of the value restriction concerns constructor functions, that is, functions that merely build values. They are treated as ordinary functions, rather than as constructors, so their applications are not considered values. For instance, in the expression let $f = \mathrm{c}\,v$ in let $z = f\,w$ in $\mathrm{t}$, where $\mathrm{c}$ is a constructor of arity 2, the partial application $\mathrm{c}\,v$ bound to $f$ is a constructor function (of arity 1), but $f\,w$ is treated as a regular application and cannot be generalized. Technically, the effect of the (strict) value restriction is summarized by the following result.
1.7.8 Lemma: Under the value restriction, the production $\mathcal{E} ::=$ let $\mathrm{z} = \mathcal{E}$ in $\mathrm{t}$ may be suppressed from the grammar of evaluation contexts (Figure 1-1) without altering the operational semantics.
We are done with definitions and requirements. We now come to the bulk of the type soundness proof.
1.7.9 Theorem [Subject Reduction]: $(\longrightarrow) \subseteq (\sqsubseteq)$.
Proof: Because $\longrightarrow$ is the smallest relation that satisfies the rules of Figure 1-2, it suffices to prove that $\sqsubseteq$ satisfies these rules as well. We remark that if, for every type $T$, $\llbracket t : T \rrbracket \Vdash \llbracket t' : T \rrbracket$ holds, then $t/\mu \sqsubseteq t'/\mu$ holds. (Take $\bar{Y} = \varnothing$ and $M' = M$, and use the fact that entailment is a congruence to check that the conditions of Definition 1.7.5 are met.) We make use of this fact in cases R-Beta and R-Let below.
  • Case R-Beta. We have
$$
\begin{aligned}
& \llbracket (\lambda z.t)\,v : T \rrbracket && \\
\equiv\ & \exists X.\,(\llbracket \lambda z.t : X \rightarrow T \rrbracket \wedge \llbracket v : X \rrbracket) && (1) \\
\equiv\ & \exists X.\,(\text{let } z : X \text{ in } \llbracket t : T \rrbracket \wedge \llbracket v : X \rrbracket) && (2) \\
\equiv\ & \exists X.\ \text{let } z : \forall\varnothing[\llbracket v : X \rrbracket].X \text{ in } \llbracket t : T \rrbracket && (3) \\
\Vdash\ & \llbracket [z \mapsto v]\,t : T \rrbracket && (4)
\end{aligned}
$$
where (1) is by definition of constraint generation; (2) is by Lemma 1.6.5; (3) is by C-LetAnd; (4) is by Lemma 1.7.2 and C-Ex*.
  • Case R-Let. We have
$$
\begin{aligned}
& \llbracket \text{let } z = v \text{ in } t : T \rrbracket && \\
=\ & \text{let } z : \forall X[\llbracket v : X \rrbracket].X \text{ in } \llbracket t : T \rrbracket && (1) \\
\Vdash\ & \llbracket [z \mapsto v]\,t : T \rrbracket && (2)
\end{aligned}
$$
where (1) is by definition of constraint generation and (2) is by Lemma 1.7.2.
  • Case R-Delta. This case is exactly requirement (i) in Definition 1.7.6.
  • Case R-Extend. Our hypotheses are $t/\mu \sqsubseteq t'/\mu'$ (1) and $\mathrm{dom}(\mu'') \mathbin{\#} \mathrm{dom}(\mu')$ (2) and $\mathrm{range}(\mu'') \mathbin{\#} \mathrm{dom}(\mu' \setminus \mu)$ (3). Because $\mathrm{dom}(\mu)$ must be a subset of $\mathrm{dom}(\mu')$, it is also disjoint from $\mathrm{dom}(\mu'')$. Our goal is $t/\mu\mu'' \sqsubseteq t'/\mu'\mu''$ (4). Thus, let us introduce a type $T$ and a store type of domain $\mathrm{dom}(\mu\mu'')$, or (equivalently) two store types $M$ and $M''$ whose domains are respectively $\mathrm{dom}(\mu)$ and $\mathrm{dom}(\mu'')$. By (1), there exist type variables $\bar{Y}$ and a store type $M'$ such that $\bar{Y} \mathbin{\#} ftv(T, M)$ (5) and $ftv(M') \subseteq \bar{Y} \cup ftv(M)$ and $\mathrm{dom}(M') = \mathrm{dom}(\mu')$ and $M'$ extends $M$ (6) and let $\Gamma_0$; ref $M$ in $\llbracket t/\mu : T/M \rrbracket \Vdash \exists\bar{Y}.$let $\Gamma_0$; ref $M'$ in $\llbracket t'/\mu' : T/M' \rrbracket$. We may further require, w.l.o.g., $\bar{Y} \mathbin{\#} ftv(M'')$ (7). Let us now add the conjunct let $\Gamma_0$; ref $M$ in $\llbracket \mu'' : M'' \rrbracket$ to each side of this entailment assertion.
On the left-hand side, by C-InAnd and by Definition 1.7.4, we obtain let $\Gamma_0$; ref $M$ in $\llbracket t/\mu\mu'' : T/MM'' \rrbracket$ (8). On the right-hand side, by (5), (7), C-ExAnd, and C-InAnd, we obtain $\exists\bar{Y}.$let $\Gamma_0$ in (let ref $M'$ in $\llbracket t'/\mu' : T/M' \rrbracket \wedge$ let ref $M$ in $\llbracket \mu'' : M'' \rrbracket$) (9). Now, recall that $M'$ extends $M$ (6) and, furthermore, (3) implies $fpi(\llbracket \mu'' : M'' \rrbracket) \mathbin{\#} dpi(M' \setminus M)$ (10). By (10), C-InAnd*, and C-InAnd, (9) is equivalent to $\exists\bar{Y}.$let $\Gamma_0$; ref $M'$ in $(\llbracket t'/\mu' : T/M' \rrbracket \wedge \llbracket \mu'' : M'' \rrbracket)$, that is, $\exists\bar{Y}.$let $\Gamma_0$; ref $M'$ in $\llbracket t'/\mu'\mu'' : T/M'M'' \rrbracket$ (11).
Thus, we have established that (8) entails (11). Let us now place this entailment assertion within the constraint context let ref $M''$ in $\square$. On the left-hand side, because $fpi(\Gamma_0, M, M'') = \varnothing$ and $dpi(M'') \cap dpi(\Gamma_0, M) \subseteq \mathrm{dom}(\mu'') \cap (\mathcal{Q} \cup \mathrm{dom}(\mu)) = \varnothing$, C-LetLet applies, yielding let $\Gamma_0$; ref $MM''$ in $\llbracket t/\mu\mu'' : T/MM'' \rrbracket$ (12). On the right-hand side, by (7), C-InEx, and by analogous reasoning, we obtain $\exists\bar{Y}.$let $\Gamma_0$; ref $M'M''$ in $\llbracket t'/\mu'\mu'' : T/M'M'' \rrbracket$ (13). Thus, (12) entails (13). Given (5), (7), given $ftv(M'M'') \subseteq \bar{Y} \cup ftv(MM'')$, and given that $M'M''$ extends $MM''$, this establishes the goal (4).
  • Case R-Context. The hypothesis is $t/\mu \sqsubseteq t'/\mu'$. The goal is $\mathcal{E}[t]/\mu \sqsubseteq \mathcal{E}[t']/\mu'$. Because $\longrightarrow$ relates closed configurations only, we may assume that the configuration $\mathcal{E}[t]/\mu$ is closed, so the memory locations that appear free within $\mathcal{E}$ are members of $\mathrm{dom}(\mu)$. Let us now reason by induction on the structure of $\mathcal{E}$.
Subcase $\mathcal{E} = []$. The hypothesis and the goal coincide.
Subcase $\mathcal{E} = \mathcal{E}_1\,t_1$. The induction hypothesis is $\mathcal{E}_1[t]/\mu \sqsubseteq \mathcal{E}_1[t']/\mu'$ (1). Let us introduce a type $T$ and a store type $M$ such that $\mathrm{dom}(M) = \mathrm{dom}(\mu)$. Consider the constraint let $\Gamma_0$; ref $M$ in $\llbracket \mathcal{E}[t]/\mu : T/M \rrbracket$ (2). By definition of constraint generation, C-ExAnd, C-InEx, and C-InAnd, it is equivalent to

$$\exists X.\,(\text{let } \Gamma_0; \text{ref } M \text{ in } \llbracket \mathcal{E}_1[t]/\mu : X \rightarrow T/M \rrbracket \wedge \text{let } \Gamma_0; \text{ref } M \text{ in } \llbracket t_1 : X \rrbracket) \quad (3)$$

where $X \notin ftv(T, M)$ (4). By (1), there exist type variables $\bar{Y}$ and a store type $M'$ such that $\bar{Y} \mathbin{\#} ftv(X, T, M)$ (5) and $ftv(M') \subseteq \bar{Y} \cup ftv(M)$ (6) and $\mathrm{dom}(M') = \mathrm{dom}(\mu')$ and $M'$ extends $M$ and (3) entails
$$\exists X.\,(\exists\bar{Y}.\ \text{let } \Gamma_0; \text{ref } M' \text{ in } \llbracket \mathcal{E}_1[t']/\mu' : X \rightarrow T/M' \rrbracket \wedge \text{let } \Gamma_0; \text{ref } M \text{ in } \llbracket t_1 : X \rrbracket) \quad (7)$$
We pointed out earlier that the memory locations that appear free in $t_1$ are members of $\mathrm{dom}(M)$, which implies let ref $M$ in $\llbracket t_1 : X \rrbracket \equiv$ let ref $M'$ in $\llbracket t_1 : X \rrbracket$ (8). By (5), C-ExAnd, (8), C-InAnd, and by definition of constraint generation, we find that (7) is equivalent to
$$\exists X\bar{Y}.\ \text{let } \Gamma_0; \text{ref } M' \text{ in } (\llbracket \mathcal{E}_1[t'] : X \rightarrow T \rrbracket \wedge \llbracket t_1 : X \rrbracket \wedge \llbracket \mu' : M' \rrbracket) \quad (9).$$
(4), (5) and (6) imply $X \notin ftv(M')$. Thus, by C-InEx and C-ExAnd, (9) may be written
$$\exists\bar{Y}.\ \text{let } \Gamma_0; \text{ref } M' \text{ in } (\exists X.\,(\llbracket \mathcal{E}_1[t'] : X \rightarrow T \rrbracket \wedge \llbracket t_1 : X \rrbracket) \wedge \llbracket \mu' : M' \rrbracket),$$
which, by definition of constraint generation, is
$$\exists\bar{Y}.\ \text{let } \Gamma_0; \text{ref } M' \text{ in } \llbracket \mathcal{E}[t']/\mu' : T/M' \rrbracket \quad (10).$$
Thus, we have proved that (2) entails (10). By Definition 1.7.5, this establishes $\mathcal{E}[t]/\mu \sqsubseteq \mathcal{E}[t']/\mu'$.
Subcase $\mathcal{E} = v\,\mathcal{E}_1$. Analogous to the previous subcase.
Subcase $\mathcal{E} =$ let $z = \mathcal{E}_1$ in $t_1$. The induction hypothesis is $\mathcal{E}_1[t]/\mu \sqsubseteq \mathcal{E}_1[t']/\mu'$ (1). This subcase is particularly interesting, because it is where let-polymorphism and side effects interact. In the previous two subcases, we relied on the fact that the $\exists\bar{Y}$ quantifier, which hides the types of the memory cells created by the reduction step, commutes with the connectives $\exists$ and $\wedge$ introduced by application contexts. However, it does not in general left-commute with the let connective (Example 1.3.28). Fortunately, under the value restriction, this subcase never arises (Lemma 1.7.8): by Definition 1.7.7, it may arise only if all constants have pure semantics, which implies $\mu = \mu' = \varnothing$. Then, we have
$$
\begin{aligned}
& \text{let } \Gamma_0 \text{ in } \llbracket \mathcal{E}[t] : T \rrbracket && \\
=\ & \text{let } \Gamma_0;\, z : \forall X[\llbracket \mathcal{E}_1[t] : X \rrbracket].X \text{ in } \llbracket t_1 : T \rrbracket && (2) \\
\equiv\ & \text{let } \Gamma_0;\, z : \forall X[\text{let } \Gamma_0 \text{ in } \llbracket \mathcal{E}_1[t] : X \rrbracket].X \text{ in } \llbracket t_1 : T \rrbracket && (3) \\
\Vdash\ & \text{let } \Gamma_0;\, z : \forall X[\text{let } \Gamma_0 \text{ in } \llbracket \mathcal{E}_1[t'] : X \rrbracket].X \text{ in } \llbracket t_1 : T \rrbracket && (4) \\
\equiv\ & \text{let } \Gamma_0 \text{ in } \llbracket \mathcal{E}[t'] : T \rrbracket && (5)
\end{aligned}
$$
where (2) is by definition of constraint generation; (3) follows from $ftv(\Gamma_0) = fpi(\Gamma_0) = \varnothing$ and C-LetDup; (4) follows from (1), specialized to the case of a pure semantics; and (5) is obtained by performing these steps in reverse.
1.7.10 Exercise [Recommended, $\star\star\star$]: Try to carry out the last subcase of the above proof in the case of an impure semantics and in the absence of the value restriction. Find out why it fails. Show that it succeeds if $\bar{Y}$ is assumed to be empty. Use this fact to prove that generalization is still safe when restricted to nonexpansive expressions, provided (i) evaluating a nonexpansive expression cannot cause new memory cells to be allocated, (ii) nonexpansive expressions are stable by substitution of values for variables, and (iii) nonexpansive expressions are preserved by reduction.
Subject reduction ensures that well-typedness is preserved by reduction.
1.7.11 Lemma: Let $t/\mu \longrightarrow t'/\mu'$. If $t/\mu$ is well-typed, then so is $t'/\mu'$.
Proof: Assume $t/\mu \longrightarrow t'/\mu'$ (1) and $t/\mu$ is well-typed (2). By (2) and Definition 1.7.4, there exist a type $T$ and a store type $M$ such that $\mathrm{dom}(\mu) = \mathrm{dom}(M)$ and the constraint let $\Gamma_0$; ref $M$ in $\llbracket t/\mu : T/M \rrbracket$ (3) is satisfiable. By Theorem 1.7.9 and Definition 1.7.5, (1) implies that there exist a set of type variables $\bar{Y}$ and a store type $M'$ such that $\mathrm{dom}(M') = \mathrm{dom}(\mu')$ (4) and the constraint (3) entails $\exists\bar{Y}.$let $\Gamma_0$; ref $M'$ in $\llbracket t'/\mu' : T/M' \rrbracket$ (5). Because (3) is satisfiable, so is (5), which implies that let $\Gamma_0$; ref $M'$ in $\llbracket t'/\mu' : T/M' \rrbracket$ is satisfiable (6). By (4) and (6) and Definition 1.7.4, $t'/\mu'$ is well-typed.
Let us now establish the progress property.
1.7.12 Lemma: If $t_1\,t_2/\mu$ is well-typed, then $t_1/\mu$ and $t_2/\mu$ are well-typed. If let $z = t_1$ in $t_2/\mu$ is well-typed, then $t_1/\mu$ is well-typed.
1.7.13 Theorem [Progress]: If $t/\mu$ is well-typed, then either it is reducible, or $t$ is a value.
Proof: The proof is by induction on the structure of $t$.
  • Case $t = z$. Well-typed configurations are closed: this case cannot occur.
  • Case $t = m$. $t$ is a value.
  • Case $t = c$. By requirement (ii) of Definition 1.7.6.
  • Case $t = \lambda z.t_1$. $t$ is a value.
  • Case $t = t_1\,t_2$. By Lemma 1.7.12, $t_1/\mu$ is well-typed. By the induction hypothesis, either it is reducible, or $t_1$ is a value. If the former, by R-Context and because every context of the form $\mathcal{E}\,t_2$ is an evaluation context, the configuration $t/\mu$ is reducible as well. Thus, let us assume $t_1$ is a value. By Lemma 1.7.12, $t_2/\mu$ is well-typed. By the induction hypothesis, either it is reducible, or $t_2$ is a value. If the former, by R-Context and because every context of the form $t_1\,\mathcal{E}$, where $t_1$ is a value, is an evaluation context, the configuration $t/\mu$ is reducible as well. Thus, let us assume $t_2$ is a value. Let us now reason by cases on the structure of $t_1$.
Subcase $t_1 = z$. Again, this subcase cannot occur.
Subcase $t_1 = m$. Because $t/\mu$ is well-typed, a constraint of the form let $\Gamma_0$; ref $M$ in $(\exists X.\,(m \preceq X \rightarrow T \wedge \llbracket t_2 : X \rrbracket) \wedge \llbracket \mu : M \rrbracket)$ must be satisfiable. This implies that $m$ is a member of $\mathrm{dom}(M)$ and that the constraint ref $M(m) \leq X \rightarrow T$ is satisfiable. Because the type constructors ref and $\rightarrow$ are incompatible, this is a contradiction. So, this subcase cannot occur.
Subcase $t_1 = \lambda z.t_1'$. By R-Beta, $t/\mu$ is reducible.
Subcase $t_1 = c\,v_1 \ldots v_k$. Then, $t$ is of the form $c\,v_1 \ldots v_{k+1}$. The result follows by requirement (ii) of Definition 1.7.6.
  • Case $t =$ let $z = t_1$ in $t_2$. By Lemma 1.7.12, $t_1/\mu$ is well-typed. By the induction hypothesis, either $t_1/\mu$ is reducible, or $t_1$ is a value. If the former, by R-Context and because every context of the form let $z = \mathcal{E}$ in $t_2$ is an evaluation context, the configuration $t/\mu$ is reducible as well. If the latter, then $t/\mu$ is reducible by R-Let.
We may now conclude:
1.7.14 Theorem [Type Soundness]: Well-typed source programs do not go wrong.
Proof: We say that a source program $t$ is well-typed if and only if the configuration $t/\varnothing$ is well-typed, that is, if and only if $\exists X.$let $\Gamma_0$ in $\llbracket t : X \rrbracket \equiv \text{true}$ holds. By Lemma 1.7.11, all reducts of $t/\varnothing$ are well-typed. By Theorem 1.7.13, none is stuck.
Let us recall that this result holds only if the requirements of Definition 1.7.6 are met. In other words, some proof obligations remain to be discharged when concrete definitions of $\mathcal{Q}$, $\xrightarrow{\delta}$, and $\Gamma_0$ are given. This is illustrated by several examples in the next section.

1.8 Constraint solving

We have introduced a parameterized constraint language, given equivalence laws that describe the interaction between its logical connectives, and exploited them to prove theorems about type inference and type soundness, which are valid independently of the nature of primitive constraints, the so-called predicate applications. However, there would be little point in proposing a parameterized constraint solver, because much of the difficulty of designing an efficient constraint solver lies precisely in the treatment of primitive constraints and in its interaction with let-polymorphism. For this reason, in this section, we focus on constraint solving in the setting of an equality-only free tree model. Thus, the constraint solver developed here allows performing type inference for $\mathrm{HM}(=)$ (that is, for Damas and Milner's type system) and for its extension with recursive types. Of course, some of its mechanisms may be useful in other settings. Other constraint solvers used in program analysis or type inference are described e.g. in (Aiken and Wimmers, 1992; Niehren, Müller, and Podelski, 1997; Fähndrich, 1999; Melski and Reps, 2000; Müller, Niehren, and Treinen, 2001; Pottier, 2001b; Nielson, Nielson, and Seidl, 2002; McAllester, 2002, 2003).
We begin with a rule-based presentation of a standard, efficient first-order unification algorithm. This yields a constraint solver for a subset of the constraint language, deprived of type scheme introduction and instantiation forms. On top of it, we build a full constraint solver, which corresponds to the code that accompanies this chapter.

Unification

Unification is the process of solving equations between terms. We now present a unification algorithm due to Huet (1976) as a (nondeterministic) system of constraint rewriting rules. The specification is almost the same in the case of finite and regular tree models: only one rule, which implements the occurs check, must be removed in the latter case. In other words, the algorithm works with possibly cyclic terms, and does not rely in an essential way on the occurs check. In order to accurately reflect the behavior of the actual algorithm, which relies on a union-find data structure (Tarjan, 1975), we modify the syntax of constraints by replacing equations with multi-equations. A multi-equation is an equation that involves an arbitrary number of types, as opposed to exactly two.
1.8.1 Definition: Let there be, for every kind $\kappa$ and for every $n \geq 1$, a predicate $=_\kappa^n$, of signature $\kappa^n \Rightarrow \cdot$, whose interpretation is ($n$-ary) equality. The predicate constraint $=_\kappa^n T_1 \ldots T_n$ is written $T_1 = \ldots = T_n$, and called a multi-equation. We consider the constraint true as a multi-equation of length 0. In the following, we identify multi-equations up to permutations of their members, so a multi-equation $\epsilon$ of kind $\kappa$ may be viewed as a finite multiset of types of kind $\kappa$. We write $\epsilon = \epsilon'$ for the multi-equation obtained by concatenating $\epsilon$ and $\epsilon'$.
Thus, we are interested in the following subset of the constraint language:
$U ::= \text{true} \mid \text{false} \mid \epsilon \mid U \wedge U \mid \exists \bar{X}.U$
Equations are replaced with multi-equations; no other predicates are available. Type scheme introduction and instantiation forms are absent.
1.8.2 Definition: A multi-equation is standard if and only if its variable members are distinct and it has at most one nonvariable member. A constraint $U$ is standard if and only if every multi-equation inside $U$ is standard and every variable that occurs (free or bound) in $U$ is a member of at most one multi-equation inside $U$.
A union-find algorithm maintains equivalence classes (that is, disjoint sets) of variables, and associates, with each class, a descriptor, which in our case is either absent or a nonvariable term. Thus, a standard constraint represents a state of the union-find algorithm. A constraint that is not standard may be viewed as a superposition of a state of the union-find algorithm, on the one hand, and of control information, on the other hand. For instance, a multi-equation of the form $\epsilon = \mathrm{T}_1 = \mathrm{T}_2$, where $\mathrm{T}_1$ and $\mathrm{T}_2$ are nonvariable terms, may be viewed, roughly speaking, as the equivalence class $\epsilon = \mathrm{T}_1$, together with a pending request to solve $\mathrm{T}_1 = \mathrm{T}_2$ and to update the class's descriptor accordingly. Because multi-equations encode both state and control, our specification of unification is rather high-level. It would be possible to give a lower-level description, where state (standard conjunctions of multi-equations) and control (pending binary equations) are distinguished.
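The correspondence between standard conjunctions of multi-equations and union-find states can be sketched in OCaml as follows. This is a hypothetical illustration, not the chapter's accompanying code: each equivalence class carries an optional descriptor, standing for the unique nonvariable member of a standard multi-equation.

```ocaml
(* A union-find structure whose classes carry an optional descriptor,
   standing for the nonvariable member of a multi-equation, if any.
   Hypothetical sketch; names are illustrative. *)
type 'a point = { mutable link : 'a link }
and 'a link =
  | Root of 'a option ref   (* representative, with its descriptor *)
  | Ptr of 'a point         (* parent pointer *)

let fresh desc = { link = Root (ref desc) }

(* Find the representative of a class, compressing paths. *)
let rec repr p =
  match p.link with
  | Root _ -> p
  | Ptr p' ->
      let r = repr p' in
      p.link <- Ptr r;
      r

let descriptor p =
  match (repr p).link with
  | Root d -> !d
  | Ptr _ -> assert false

(* Fuse two classes, combining their descriptors with [merge]; in a
   solver, [merge] would emit an equation between two nonvariable
   terms (cf. S-Fuse followed by S-Decompose). *)
let union merge p1 p2 =
  let r1 = repr p1 and r2 = repr p2 in
  if r1 != r2 then
    match r1.link, r2.link with
    | Root d1, Root d2 -> d2 := merge !d1 !d2; r1.link <- Ptr r2
    | _ -> assert false
```

In this reading, S-Fuse corresponds to a call to `union`, and a standard constraint is exactly a union-find state in which every class has at most one descriptor.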
1.8.3 Definition: Let $U$ be a conjunction of multi-equations. $\mathrm{Y}$ is dominated by $\mathrm{X}$ with respect to $U$ (written: $\mathrm{Y} \prec_U \mathrm{X}$) if and only if $U$ contains a conjunct of the form $\mathrm{X} = F\,\vec{\mathrm{T}} = \epsilon$, where $\mathrm{Y} \in ftv(\vec{\mathrm{T}})$. $U$ is cyclic if and only if the graph of $\prec_U$ exhibits a cycle.
The specification of the unification algorithm consists of a set of constraint rewriting rules, given in Figure 1-11. Rewriting is performed modulo α α alpha\alphaα-conversion, modulo permutations of the members of a multi-equation, modulo commutativity and associativity of conjunction, and under an arbitrary context. The specification is nondeterministic: several rule instances may be simultaneously applicable.
S-ExAnd is a directed version of C-ExAnd, whose effect is to float up all existential quantifiers. In the process, all multi-equations become part of a single conjunction, possibly causing rules whose left-hand side is a conjunction of multi-equations, namely S-Fuse and S-Cycle, to become applicable. S-Fuse identifies two multi-equations that share a common variable $\mathrm{X}$, and fuses them. The new multi-equation is not necessarily standard, even if the two original multi-equations were. Indeed, it may have repeated variables or contain two nonvariable terms. The purpose of the next few rules, whose left-hand side consists of a single multi-equation, is to deal with these situations. S-Stutter eliminates redundant variables. It deals only with variables, as opposed to terms of arbitrary size, so as to have constant time cost. The comparison of nonvariable terms is implemented by S-Decompose and S-Clash. S-Decompose decomposes an equation between two terms whose head symbols match. It produces a conjunction of equations between their subterms, namely $\vec{\mathrm{X}} = \vec{\mathrm{T}}$. Only one of the two terms remains in the original multi-equation, which may thus become standard. The terms $\vec{\mathrm{X}}$ are copied: there are two occurrences of $\vec{\mathrm{X}}$ on the right-hand side. For this reason, we
$(\exists \bar{X}.U_1) \wedge U_2 \;\rightarrow\; \exists \bar{X}.(U_1 \wedge U_2)$ if $\bar{X} \mathrel{\#} ftv(U_2)$ (S-ExAnd)

$\mathrm{X} = \epsilon \wedge \mathrm{X} = \epsilon' \;\rightarrow\; \mathrm{X} = \epsilon = \epsilon'$ (S-Fuse)

$\mathrm{X} = \mathrm{X} = \epsilon \;\rightarrow\; \mathrm{X} = \epsilon$ (S-Stutter)

$F\,\vec{\mathrm{X}} = F\,\vec{\mathrm{T}} = \epsilon \;\rightarrow\; \vec{\mathrm{X}} = \vec{\mathrm{T}} \wedge F\,\vec{\mathrm{X}} = \epsilon$ (S-Decompose)

$F\,\mathrm{T}_1 \ldots \mathrm{T}_i \ldots \mathrm{T}_n = \epsilon \;\rightarrow\; \exists \mathrm{X}.(\mathrm{X} = \mathrm{T}_i \wedge F\,\mathrm{T}_1 \ldots \mathrm{X} \ldots \mathrm{T}_n = \epsilon)$ if $\mathrm{T}_i \notin \mathcal{V} \wedge \mathrm{X} \notin ftv(\mathrm{T}_1, \ldots, \mathrm{T}_n, \epsilon)$ (S-Name-1)

$F\,\vec{\mathrm{T}} = F'\,\vec{\mathrm{T}}' = \epsilon \;\rightarrow\; \text{false}$ if $F \neq F'$ (S-Clash)

$\mathrm{T} \;\rightarrow\; \text{true}$ if $\mathrm{T} \notin \mathcal{V}$ (S-Single)

$U \wedge \text{true} \;\rightarrow\; U$ (S-True)

$U \;\rightarrow\; \text{false}$ if the model is syntactic and $U$ is cyclic (S-Cycle)

$\mathcal{U}[\text{false}] \;\rightarrow\; \text{false}$ if $\mathcal{U} \neq []$ (S-Fail)

Figure 1-11: Unification

require them to be type variables, as opposed to terms of arbitrary size. (We slightly abuse notation by using $\vec{\mathrm{X}}$ to denote a vector of type variables whose elements are not necessarily distinct.) By doing so, we allow explicit reasoning about sharing: since a variable represents a pointer to an equivalence class, we explicitly specify that only pointers, not whole terms, are copied. As a result of this decision, S-Decompose is not applicable when both terms at hand have a nonvariable subterm. S-Name-1 remedies this problem by introducing a fresh variable that stands for one such subterm. When repeatedly applied, S-Name-1 yields a unification problem composed of so-called small terms only, that is, one where sharing has been made fully explicit. S-Clash complements S-Decompose by dealing with the case where two terms with different head symbols are equated; in a free tree model, such an equation is false, so failure is signaled. S-Single and S-True suppress multi-equations of size 1 and 0, respectively, which are tautologies. S-Single is restricted to nonvariable terms so as not to break the property that every variable is a member of exactly one multi-equation (Definition 1.8.2). S-Cycle is the occurs check: that is, it signals failure if the constraint is cyclic. It is applicable only in the case of syntactic unification, that is, when ground types are finite trees. It is a global check: its left-hand side is an entire conjunction of multi-equations. S-Fail propagates failure; $\mathcal{U}$ ranges over unification constraint contexts.
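The global nature of S-Cycle can be made concrete: over a problem made of small terms, where every variable has at most one shallow nonvariable member, cyclicity of $\prec_U$ is a depth-first search. The following is a hypothetical OCaml sketch (the function and type names are ours, not the accompanying code's):

```ocaml
(* S-Cycle as a global check: detect a cycle in the dominance relation
   over a problem in which every variable has at most one shallow,
   nonvariable member. Hypothetical sketch. *)
type var = int
type small_term = { head : string; args : var list }

let cyclic (descr : var -> small_term option) (roots : var list) : bool =
  let module S = Set.Make (Int) in
  let grey = ref S.empty       (* variables on the current DFS path *)
  and black = ref S.empty in   (* variables fully explored *)
  let rec visit v =
    if S.mem v !grey then true                (* back edge: cycle found *)
    else if S.mem v !black then false
    else begin
      grey := S.add v !grey;
      let found =
        match descr v with
        | None -> false
        | Some t -> List.exists visit t.args  (* follow dominance edges *)
      in
      grey := S.remove v !grey;
      black := S.add v !black;
      found
    end
  in
  List.exists visit roots
```

Because each variable is visited at most once, the check runs in time linear in the size of the problem, which is why it can reasonably be deferred and performed once rather than at every union.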
The constraint rewriting system in Figure 1-11 enjoys the following properties. First, rewriting is strongly normalizing, so the rules define a (nondeterministic) algorithm. Second, rewriting is meaning-preserving. Third, every normal form is either false or of the form X ¯ . U X ¯ . U EE bar(X).U\exists \overline{\mathrm{X}} . UX¯.U, where U U UUU is satisfiable. The latter two properties indicate that the algorithm is indeed a constraint solver.
1.8.4 Lemma: The rewriting system $\rightarrow$ is strongly normalizing.
1.8.5 Lemma: $U_1 \rightarrow U_2$ implies $U_1 \equiv U_2$.
1.8.6 Lemma: Every normal form is either false or of the form $\mathcal{X}[U]$, where $\mathcal{X}$ is an existential constraint context, $U$ is a standard conjunction of multi-equations and, if the model is syntactic, $U$ is acyclic. These conditions imply that $U$ is satisfiable.
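To make these rules concrete, here is a compact first-order unifier in the spirit of Figure 1-11. It is a hypothetical sketch, not the accompanying code: in particular, it performs the occurs check eagerly at each binding, whereas S-Cycle performs it once, globally, at the end.

```ocaml
(* A compact unifier over first-order terms, with mutable variables
   playing the role of union-find points. Hypothetical sketch. *)
type term =
  | Var of var
  | App of string * term list                (* F T1 ... Tn *)
and var = { id : int; mutable def : term option }

exception Clash   (* S-Clash: distinct head symbols *)
exception Cycle   (* occurs check failure (finite-tree model) *)

(* Follow variable definitions to the class representative. *)
let rec repr t =
  match t with
  | Var ({ def = Some t'; _ } as v) ->
      let r = repr t' in
      v.def <- Some r;                       (* path compression *)
      r
  | _ -> t

let rec occurs v t =
  match repr t with
  | Var v' -> v == v'
  | App (_, ts) -> List.exists (occurs v) ts

let rec unify t1 t2 =
  match repr t1, repr t2 with
  | Var v1, Var v2 when v1 == v2 -> ()
  | Var v, t | t, Var v ->
      if occurs v t then raise Cycle;        (* eager occurs check *)
      v.def <- Some t
  | App (f, ts), App (g, us) ->
      if f <> g || List.length ts <> List.length us then raise Clash;
      List.iter2 unify ts us                 (* S-Decompose *)
```

A successful run leaves the variables' `def` fields in a state corresponding to Lemma 1.8.6's satisfiable normal forms; `Clash` and `Cycle` correspond to the two ways of reaching false.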

A constraint solver

On top of the unification algorithm, we now define a constraint solver. Its specification is independent of the rules and strategy employed by the unification algorithm. However, the structure of the unification algorithm's normal forms, as well as the logical properties of multi-equations, are exploited when performing generalization, that is, when creating and simplifying type schemes. Like the unification algorithm, the constraint solver is specified as a reduction system. However, the objects that are subject to rewriting are not just constraints: they have more complex structure. Working with richer states allows distinguishing between the solver's external language, namely the full constraint language, which is used to express the problem that one wishes to solve, and an internal language, introduced below, which is used to describe the solver's private data structures.

In the following, $C$ and $D$ range over external constraints, that is, constraints that were part of the solver's input. External constraints are to be viewed as abstract syntax trees, subject to no implicit laws other than $\alpha$-conversion. As a simplifying assumption, we require external constraints not to contain any occurrence of false; otherwise, the problem at hand is clearly false. Internal data structures include unification constraints $U$, as previously studied, and stacks. Stacks form a subset of constraint contexts, defined on page 24. Their syntax is as follows:
$S ::= [] \mid S[[] \wedge C] \mid S[\exists \bar{X}.[]] \mid S[\text{let } \mathrm{x} : \forall \bar{X}[[]].\mathrm{T} \text{ in } C] \mid S[\text{let } \mathrm{x} : \sigma \text{ in } []]$
In the second and fourth productions, $C$ is an external constraint. In the last production, we require $\sigma$ to be of the form $\forall \bar{X}[U].\mathrm{X}$, and we demand $\exists \sigma \equiv \text{true}$. A stack may be viewed as a list of frames. Frames may be added and deleted at the inner end of a stack, that is, near the hole of the constraint context that it represents. We refer to the four kinds of frames as conjunction, existential, let, and environment frames, respectively. A state of the constraint solver is a triple $S;U;C$, where $S$ is a stack, $U$ is a unification constraint, and $C$ is an external constraint. The state $S;U;C$ is to be understood as a representation of the constraint $S[U \wedge C]$. The notion of $\alpha$-equivalence between states is defined accordingly. In particular, one may rename type variables in $dtv(S)$, provided $U$ and $C$ are renamed as well. In short, the three components of a state play the following roles. $C$ is an external constraint that the solver intends to examine next. $U$ is the internal state of the underlying unification algorithm: one might think of it as the knowledge that has been obtained so far. $S$ tells where the type variables that occur free in $U$ and $C$ are bound, associates type schemes with the program variables that occur free in $C$, and records what should be done after $C$ is solved. The solver's initial state is usually of the form $[];\text{true};C$, where $C$ is the external constraint that one wishes to solve, that is, whose satisfiability one wishes to determine. For simplicity, we make the (unessential) assumption that states have no free type variables.
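The shape of states can be rendered as OCaml types. This is a hypothetical sketch: constructor names and simplifications are ours, and the accompanying code's actual representations differ (notably by using ranks instead of existential frames, as discussed below).

```ocaml
(* A rendering of solver states S;U;C as OCaml types.
   Hypothetical sketch. *)
type tyvar = string
type ty = TVar of tyvar | TArrow of ty * ty

(* External constraints C, D: plain abstract syntax trees. *)
type constr =
  | CTrue
  | CEq of ty * ty
  | CInst of string * ty                        (* x <= T (instantiation) *)
  | CAnd of constr * constr
  | CExists of tyvar list * constr
  | CLet of string * tyvar list * constr * ty * constr
                                                (* let x : forall X[D].T in C *)

(* Internal language: multi-equations and unification constraints. *)
type multi_eq = ty list                         (* T1 = ... = Tn *)
type unif = { exists : tyvar list; body : multi_eq list }

type scheme = { forall : tyvar list; premise : unif; root : ty }

(* Stack frames, inner end at the head of the list. *)
type frame =
  | FConj of constr                             (* [] /\ C *)
  | FExists of tyvar list                       (* exists X.[] *)
  | FLet of string * tyvar list * ty * constr   (* let x : forall X[[]].T in C *)
  | FEnv of string * scheme                     (* let x : sigma in [] *)

type state = { stack : frame list; unif : unif; external_ : constr }

(* The initial state []; true; C for a given external constraint. *)
let initial c = { stack = []; unif = { exists = []; body = [] }; external_ = c }
```

Keeping external constraints immutable while the internal components are rewritten mirrors the separation between the solver's input language and its private data structures.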
The solver consists of a (nondeterministic) state rewriting system, given in Figure 1-12. Rewriting is performed modulo $\alpha$-conversion. S-Unify makes the unification algorithm a component of the constraint solver, and allows the current unification problem $U$ to be solved at any time. Rules S-Ex-1 to S-Ex-4 float existential quantifiers out of the unification problem into the stack, and through the stack up to the nearest enclosing let frame, if there is one, or to the outermost level, otherwise. Their side conditions prevent capture of type variables, and may always be satisfied by suitable $\alpha$-conversion of the left-hand state. If $S;U;C$ is a normal form with respect to these five rules, then every type variable in $dtv(S)$ is either universally quantified at a let frame, or existentially bound at the outermost level. (Recall that, by assumption, states have no free type variables.) In other words, provided these rules are applied in an eager fashion, there is no need for existential frames to appear in the machine representation of stacks. Instead, it suffices to maintain, at every let frame and at the outermost level, a list of the type variables that are bound at this point; and, conversely, to annotate every type variable in $dtv(S)$ with an integer rank, which tells, in constant time, where the variable is bound: type variables of rank 0 are bound at the outermost level, and type variables of rank $k \geq 1$ are bound at the $k$-th let frame down in the stack $S$. The code that accompanies this chapter adopts this convention. Ranks were
$S;U;C \;\rightarrow\; S;U';C$ if $U \rightarrow U'$ (S-Unify)

$S;\exists \bar{X}.U;C \;\rightarrow\; S[\exists \bar{X}.[]];U;C$ if $\bar{X} \mathrel{\#} ftv(C)$ (S-Ex-1)

$S[(\exists \bar{X}.[]) \wedge C] \;\rightarrow\; S[\exists \bar{X}.([] \wedge C)]$ if $\bar{X} \mathrel{\#} ftv(C)$ (S-Ex-2)

$S[\text{let } \mathrm{x} : \forall \bar{X}[\exists \bar{Y}.[]].\mathrm{T} \text{ in } C] \;\rightarrow\; S[\text{let } \mathrm{x} : \forall \bar{X}\bar{Y}[[]].\mathrm{T} \text{ in } C]$ if $\bar{Y} \mathrel{\#} ftv(\mathrm{T})$ (S-Ex-3)

$S[\text{let } \mathrm{x} : \sigma \text{ in } \exists \bar{X}.[]] \;\rightarrow\; S[\exists \bar{X}.\text{let } \mathrm{x} : \sigma \text{ in } []]$ if $\bar{X} \mathrel{\#} ftv(\sigma)$ (S-Ex-4)

$S;U;\mathrm{T}_1 = \mathrm{T}_2 \;\rightarrow\; S;U \wedge \mathrm{T}_1 = \mathrm{T}_2;\text{true}$ (S-Solve-Eq)

$S;U;\mathrm{x} \preceq \mathrm{T} \;\rightarrow\; S;U;S(\mathrm{x}) \preceq \mathrm{T}$ (S-Solve-Id)

$S;U;C_1 \wedge C_2 \;\rightarrow\; S[[] \wedge C_2];U;C_1$ (S-Solve-And)

$S;U;\exists \bar{X}.C \;\rightarrow\; S[\exists \bar{X}.[]];U;C$ if $\bar{X} \mathrel{\#} ftv(U)$ (S-Solve-Ex)

$S;U;\text{let } \mathrm{x} : \forall \bar{X}[D].\mathrm{T} \text{ in } C \;\rightarrow\; S[\text{let } \mathrm{x} : \forall \bar{X}[[]].\mathrm{T} \text{ in } C];U;D$ if $\bar{X} \mathrel{\#} ftv(U)$ (S-Solve-Let)

$S[[] \wedge C];U;\text{true} \;\rightarrow\; S;U;C$ (S-Pop-And)

$S[\text{let } \mathrm{x} : \forall \bar{X}[[]].\mathrm{T} \text{ in } C];U;\text{true} \;\rightarrow\; S[\text{let } \mathrm{x} : \forall \bar{X}\mathrm{X}[[]].\mathrm{X} \text{ in } C];U \wedge \mathrm{X} = \mathrm{T};\text{true}$ if $\mathrm{X} \notin ftv(U, \mathrm{T}) \wedge \mathrm{T} \notin \mathcal{V}$ (S-Name-2)

$S[\text{let } \mathrm{x} : \forall \bar{X}\mathrm{Y}[[]].\mathrm{X} \text{ in } C];\mathrm{Y} = \mathrm{Z} = \epsilon \wedge U;\text{true} \;\rightarrow\; S[\text{let } \mathrm{x} : \forall \bar{X}\mathrm{Y}[[]].\theta(\mathrm{X}) \text{ in } C];\mathrm{Y} \wedge \mathrm{Z} = \theta(\epsilon) \wedge \theta(U);\text{true}$ if $\mathrm{Y} \neq \mathrm{Z} \wedge \theta = [\mathrm{Y} \mapsto \mathrm{Z}]$ (S-Compress)

$S[\text{let } \mathrm{x} : \forall \bar{X}\mathrm{Y}[[]].\mathrm{X} \text{ in } C];\mathrm{Y} = \epsilon \wedge U;\text{true} \;\rightarrow\; S[\text{let } \mathrm{x} : \forall \bar{X}[[]].\mathrm{X} \text{ in } C];\epsilon \wedge U;\text{true}$ if $\mathrm{Y} \notin ftv(\mathrm{X}, \epsilon, U)$ (S-UnName)

$S[\text{let } \mathrm{x} : \forall \bar{X}\bar{Y}[[]].\mathrm{X} \text{ in } C];U;\text{true} \;\rightarrow\; S[\exists \bar{Y}.\text{let } \mathrm{x} : \forall \bar{X}[[]].\mathrm{X} \text{ in } C];U;\text{true}$ if $\bar{Y} \mathrel{\#} ftv(C) \wedge \exists \bar{X}.U$ determines $\bar{Y}$ (S-LetAll)

$S[\text{let } \mathrm{x} : \forall \bar{X}[[]].\mathrm{X} \text{ in } C];U_1 \wedge U_2;\text{true} \;\rightarrow\; S[\text{let } \mathrm{x} : \forall \bar{X}[U_2].\mathrm{X} \text{ in } []];U_1;C$ if $\bar{X} \mathrel{\#} ftv(U_1) \wedge \exists \bar{X}.U_2 \equiv \text{true}$ (S-Pop-Let)

$S[\text{let } \mathrm{x} : \sigma \text{ in } []];U;\text{true} \;\rightarrow\; S;U;\text{true}$ (S-Pop-Env)
Figure 1-12: A constraint solver
initially described by Rémy (1992a), and also appear in McAllester (2003).
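The rank discipline just described can be sketched as follows. This is a hypothetical illustration; in the accompanying code, ranks are integrated into the union-find descriptors.

```ocaml
(* Integer ranks: rank 0 means bound at the outermost level; rank k >= 1
   means bound at the k-th let frame down the stack. Hypothetical sketch. *)
type v = { name : string; mutable rank : int }

(* When two equivalence classes are fused, the merged class is bound at
   the older (lower-ranked) of the two binding sites. *)
let fuse v1 v2 =
  let r = min v1.rank v2.rank in
  v1.rank <- r;
  v2.rank <- r

(* When popping the let frame at rank [level], a variable may be
   generalized only if it is still bound at that very frame; a variable
   whose rank dropped below [level] has escaped and must not be. *)
let generalizable level v = v.rank = level
```

For instance, unifying a variable of rank 2 with one of rank 1 lowers the class to rank 1: the variable has escaped the innermost let frame and is therefore not quantified when that frame is popped.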
Rules S-SOLVE-EQ to S-SOLVE-LET encode an analysis of the structure of the third component of the current state. There is one rule for each possible case, except false, which by assumption cannot arise, and true, which is dealt with further on. S-SOLVE-EQ discovers an equation and makes it available to the unification algorithm. S-SOLVE-ID discovers an instantiation constraint x ⪯ T and replaces it with σ ⪯ T, where σ = S(x) is the type scheme carried by the nearest environment frame that defines x in the stack S. It is defined as follows:
S[□ ∧ C](x) = S(x)
S[∃X̄.□](x) = S(x)    if X̄ # ftv(S(x))
S[let y : ∀X̄[□].T in C](x) = S(x)    if X̄ # ftv(S(x))
S[let y : σ in □](x) = S(x)    if x ≠ y
S[let x : σ in □](x) = σ
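For concreteness, this lookup can be sketched in Python (a sketch only, with hypothetical representations: the stack is a list of frames, innermost frame last, and only environment frames define a program identifier; the freshness side-conditions, handled by α-conversion in the text, are ignored):

```python
# A sketch (not the authors' code) of the lookup S(x): the stack is a list
# of frames, innermost frame last; only environment frames ("let x : sigma
# in []") define a program identifier, and the nearest one wins.
def lookup(x, stack):
    for frame in reversed(stack):              # scan from the innermost frame out
        if frame[0] == "env" and frame[1] == x:
            return frame[2]                    # the type scheme sigma it carries
    return None                                # x is outside dpi(S): undefined

stack = [("env", "x", "int"), ("conj", "C"),
         ("env", "x", "bool"), ("exists", ["X"])]
assert lookup("x", stack) == "bool"            # the nearest frame shadows outer ones
assert lookup("y", stack) is None
```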
If x ∈ dpi(S) does not hold, then S(x) is undefined and the rule is not applicable. If it does hold, then the rule may always be made applicable by suitable α-conversion of the left-hand state. Please recall that, if σ is of the form ∀X̄[U].X, where X̄ # ftv(T), then σ ⪯ T stands for ∃X̄.(U ∧ X = T). The process of constructing this constraint is informally referred to as "taking an instance of σ". It involves taking fresh copies of the type variables X̄, of the unification constraint U, and of the body X. In the worst case, this process is just as inefficient as textually expanding the corresponding let construct in the program's source code, and leads to exponential time complexity (Mairson, Kanellakis, and Mitchell, 1991). In practice, however, the unification constraint U is often compact, because it was simplified before the environment frame let x : σ in [] was created, which is why the solver usually performs well. (The creation of environment frames, performed by S-POP-LET, is discussed below.) S-SOLVE-AND discovers a conjunction. It arbitrarily chooses to explore the left branch first, and pushes a conjunction frame onto the stack, so as to record that the right branch should be explored afterwards. S-SOLVE-EX discovers an existential quantifier and enters it, creating a new existential frame to record its existence.
Similarly, S-SOLVE-LET discovers a let form and enters its left-hand side, creating a new let frame to record its existence. The choice of examining the left-hand side first is not arbitrary. Indeed, examining the right-hand side first would require creating an environment frame, but environment frames must contain simplified type schemes of the form ∀X̄[U].X, whereas the type scheme ∀X̄[D].T is arbitrary. In other words, our strategy is to simplify type schemes prior to allowing them to be copied by S-SOLVE-ID, so as to avoid any duplication of effort. The side-conditions of S-SOLVE-EX and S-SOLVE-LET may always be satisfied by suitable α-conversion of the left-hand state.
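The instance-taking step performed by S-SOLVE-ID, that is, building ∃X̄.(U ∧ X = T) out of fresh copies of X̄, U, and the body, can be sketched as follows (all representations here are hypothetical, not the authors' implementation; types are variables, encoded as strings, or (constructor, arguments) pairs):

```python
# A sketch of "taking an instance" of sigma = forall Xs[U].body against a
# type t: the quantifiers Xs are renamed to fresh copies in U and in the
# body, yielding the constraint "exists Xs'.(U' and body' = t)".
counter = 0

def fresh(x):
    global counter
    counter += 1
    return f"{x}_{counter}"

def rename(term, r):
    if isinstance(term, str):
        return r.get(term, term)               # free variables are left untouched
    head, args = term
    return (head, [rename(a, r) for a in args])

def instantiate(quantifiers, u, body, t):
    r = {x: fresh(x) for x in quantifiers}     # fresh copies of the quantifiers
    eqs = [(rename(a, r), rename(b, r)) for (a, b) in u]
    return list(r.values()), eqs + [(rename(body, r), t)]

# sigma = forall X[X = Y -> Y].X, instantiated against the type Z:
exists_vars, eqs = instantiate(["X"], [("X", ("arrow", ["Y", "Y"]))], "X", "Z")
assert eqs[0][1] == ("arrow", ["Y", "Y"])      # Y is free in sigma: not renamed
assert eqs[1] == (exists_vars[0], "Z")         # fresh copy of the body equals Z
```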
Rules S-SOLVE-EQ to S-SOLVE-LET may be referred to as forward rules, because they "move down into" the external constraint, causing the stack to grow. This process stops when the external constraint at hand becomes true. Then, part of the work has been finished, and the solver must examine the stack in order to determine what to do next. This task is performed by the last series of rules, which may be referred to as backward rules, because they "move back out", causing the stack to shrink, and possibly scheduling new external constraints for examination. These rules encode an analysis of the structure of the innermost stack frame. There are three cases, corresponding to conjunction, let, and environment frames. The case of existential stack frames need not be considered, because rules S-Ex-2 to S-Ex-4 allow either fusing them with let frames or floating them up to the outermost level, where they shall remain inert. S-POP-AND deals with conjunction frames. The frame is popped, and the external constraint that it carries is scheduled for examination. S-POP-ENV deals with environment frames. Because the right-hand side of the let construct at hand has been solved, that is, turned into a unification constraint U, it cannot contain an occurrence of x. Furthermore, by assumption, ∃σ is true. Thus, this environment frame is no longer useful: it is destroyed. The remaining rules deal with let frames. Roughly speaking, their purpose is to change the state S[let x : ∀X̄[□].T in C] ; U ; true into S[let x : ∀X̄[U].T in □] ; true ; C, that is, to turn the current unification constraint U into a type scheme, turn the let frame into an environment frame, and schedule the right-hand side of the let construct (that is, the external constraint C) for examination. In fact, the process is more complex, because the type scheme ∀X̄[U].T must be simplified before becoming part of an environment frame. The simplification process is described by rules S-NAME-2 to S-POP-LET. In the following, we refer to type variables in X̄ as young and to type variables in dtv(S) \ X̄ as old. The former are the universal quantifiers of the type scheme that is being created; the latter are its free type variables.
S-NAME-2 ensures that the body T T TTT of the type scheme that is being created is a type variable, as opposed to an arbitrary term. If it isn't, then it is replaced with a fresh variable X X X\mathrm{X}X, and the equation X = T X = T X=T\mathrm{X}=\mathrm{T}X=T is added so as to recall that X X X\mathrm{X}X stands for T T T\mathrm{T}T. Thus, the rule moves the term T T T\mathrm{T}T into the current unification problem, where it potentially becomes subject to S-NAME-1. This ensures that sharing is made explicit everywhere. S-COMPRESS determines that the (young) type variable Y Y Y\mathrm{Y}Y is an alias for the type variable Z Z Z\mathrm{Z}Z. Then, every free occurrence of Y other than its defining occurrence is replaced with Z. In an actual implementation, this occurs transparently when the union-find algorithm performs path compression (Tarjan, 1975, 1979), provided we are careful never to create a link from a variable to a variable of higher rank. This requires making the unification algorithm aware of ranks, but is otherwise
easily achieved. S-UNNAME determines that the (young) type variable Y has no occurrences other than its defining occurrence in the current type scheme. (This occurs, in particular, when S-COMPRESS has just been applied.) Then, Y is suppressed altogether. In the particular case where the remaining multi-equation ε has cardinal 1, it may then be suppressed by S-SINGLE. In other words, the combination of S-UNNAME and S-SINGLE is able to suppress young unused type variables as well as the term that they stand for. This may, in turn, cause new type variables to become eligible for elimination by S-UNNAME. In fact, assuming the current unification constraint is acyclic, an inductive argument shows that every young type variable may be suppressed unless it is dominated either by X or by an old type variable. (In the setting of a regular tree model, it is possible to extend the rule so that young cycles that are not dominated either by X or by an old type variable are suppressed as well.) S-LETALL is a directed version of C-LETALL. It turns the young type variables Ȳ into old variables. How to tell whether ∃X̄.U determines Ȳ is discussed later (see Lemma 1.8.7). Why S-LETALL is an interesting and important rule will be explained shortly. S-POP-LET is meant to be applied when the current state has become a normal form with respect to S-UNIFY, S-NAME-2, S-COMPRESS, S-UNNAME, and S-LETALL, that is, when the type scheme that is about to be created is fully simplified.
It splits the current unification constraint into two components U₁ and U₂, where U₁ is made up entirely of old variables, as expressed by the side-condition X̄ # ftv(U₁), and U₂ constrains young variables only, as expressed by the side-condition ∃X̄.U₂ ≡ true. Please note that U₂ may still contain free occurrences of old type variables, so the type scheme ∀X̄[U₂].X that appears on the right-hand side is not necessarily closed. It is not obvious why such a decomposition must exist; the proof of Lemma 1.8.11 sheds more light on this issue. Let us say, for now, that S-LETALL plays a role in guaranteeing its existence, whence part of its importance. Once the decomposition U₁ ∧ U₂ is obtained, the behavior of S-POP-LET is simple. The unification constraint U₁ concerns old variables only, that is, variables that are not quantified in the current let frame; thus, it need not become part of the new type scheme, and may instead remain part of the current unification constraint. This is justified by C-LETAND and C-INAND* (see the proof of Lemma 1.8.10) and corresponds to the difference between HMX-GEN' and HMX-GEN discussed in Section 1.4. The unification constraint U₂, on the other hand, becomes part of the newly built type scheme ∀X̄[U₂].X. The property ∃X̄.U₂ ≡ true guarantees that the newly created environment frame meets the requirements imposed on such frames. Please note that, the more type variables are considered old, the larger U₁ may become, and the smaller U₂. This is another reason why S-LETALL is interesting: by allowing more variables to be considered old, it decreases the size of the type scheme ∀X̄[U₂].X, making it cheaper to take instances of.
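The naming discipline enforced by S-NAME-1 and S-NAME-2, under which every subterm receives a name so that all sharing is explicit, can be sketched as a flattening pass (a sketch with hypothetical representations; each equation produced relates a variable to a term of depth at most one):

```python
# A sketch of the naming step behind S-NAME-1 and S-NAME-2: every subterm
# is given a fresh name, so each resulting equation is "small".  Terms are
# variables (strings) or (constructor, arguments) pairs.
counter = 0

def fresh():
    global counter
    counter += 1
    return f"X{counter}"

def name(term, eqs):
    # Return a variable standing for term, appending equations to eqs.
    if isinstance(term, str):
        return term
    head, args = term
    x = fresh()
    eqs.append((x, (head, [name(a, eqs) for a in args])))
    return x

eqs = []
root = name(("arrow", [("int", []), ("arrow", [("int", []), ("int", [])])]), eqs)
assert eqs[-1][0] == root                      # the last name stands for the whole term
assert all(all(isinstance(a, str) for a in t[1]) for (_, t) in eqs)  # depth one
```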
To complete our description of the constraint solver, there remains to explain how to decide when X ¯ . U X ¯ . U EE bar(X).U\exists \overline{\mathrm{X}} . UX¯.U determines Y ¯ Y ¯ bar(Y)\overline{\mathrm{Y}}Y¯, since this predicate occurs in the side-condition of S-LETALL. The following lemma describes two important situations where, by examining the structure of an equation, it is possible to discover that a constraint C C CCC determines some of its free type variables Y ¯ Y ¯ bar(Y)\bar{Y}Y¯ (Definition 1.3.26). In the first situation, the type variables Y ¯ Y ¯ bar(Y)\overline{\mathrm{Y}}Y¯ are equated with or dominated by a distinct type variable X that occurs free in C C CCC. In that case, because the model is a free tree model, the values of the type variables Y ¯ Y ¯ bar(Y)\bar{Y}Y¯ are determined by the value of X X XXX - they are subtrees of it at specific positions. For instance, X = Y 1 Y 2 X = Y 1 Y 2 X=Y_(1)rarrY_(2)\mathrm{X}=\mathrm{Y}_{1} \rightarrow \mathrm{Y}_{2}X=Y1Y2 determines Y 1 Y 2 Y 1 Y 2 Y_(1)Y_(2)\mathrm{Y}_{1} \mathrm{Y}_{2}Y1Y2, while Y 1 . ( X = Y 1 Y 2 ) Y 1 . X = Y 1 Y 2 EEY_(1).(X=Y_(1)rarrY_(2))\exists \mathrm{Y}_{1} .\left(\mathrm{X}=\mathrm{Y}_{1} \rightarrow \mathrm{Y}_{2}\right)Y1.(X=Y1Y2) determines Y 2 Y 2 Y_(2)\mathrm{Y}_{2}Y2. In the second situation, the type variables Y ¯ Y ¯ bar(Y)\overline{\mathrm{Y}}Y¯ are equated with a term T, all of whose free type variables are free in C C CCC. Again, the value of the type variables Y ¯ Y ¯ bar(Y)\overline{\mathrm{Y}}Y¯ is then determined by the values of the type variables f t v ( T ) f t v ( T ) ftv(T)f t v(T)ftv(T)-indeed, the term T T T\mathrm{T}T itself defines a function that maps the latter to the former. For instance, X = Y 1 Y 2 X = Y 1 Y 2 X=Y_(1)rarrY_(2)\mathrm{X}=\mathrm{Y}_{1} \rightarrow \mathrm{Y}_{2}X=Y1Y2 determines X X X\mathrm{X}X, while Y 1 Y 1 EEY_(1)\exists \mathrm{Y}_{1}Y1. 
( X = Y 1 Y 2 ) X = Y 1 Y 2 (X=Y_(1)rarrY_(2))\left(\mathrm{X}=\mathrm{Y}_{1} \rightarrow \mathrm{Y}_{2}\right)(X=Y1Y2) does not. In the second situation, no assumption is in fact made about the model. Please note that X = Y 1 Y 2 X = Y 1 Y 2 X=Y_(1)rarrY_(2)\mathrm{X}=\mathrm{Y}_{1} \rightarrow \mathrm{Y}_{2}X=Y1Y2 determines Y 1 Y 2 Y 1 Y 2 Y_(1)Y_(2)\mathrm{Y}_{1} \mathrm{Y}_{2}Y1Y2 and determines X X X\mathrm{X}X, but does not simultaneously determine X Y 1 Y 2 X Y 1 Y 2 XY_(1)Y_(2)\mathrm{XY}_{1} \mathrm{Y}_{2}XY1Y2.
1.8.7 Lemma: Let X ¯ # Y ¯ X ¯ # Y ¯ bar(X)# bar(Y)\overline{\mathrm{X}} \# \overline{\mathrm{Y}}X¯#Y¯. Assume either ϵ ϵ epsilon\epsilonϵ is X = ϵ X = ϵ X=epsilon^(')\mathrm{X}=\epsilon^{\prime}X=ϵ, where X X ¯ Y ¯ X X ¯ Y ¯ X!in bar(X) bar(Y)\mathrm{X} \notin \overline{\mathrm{X}} \overline{\mathrm{Y}}XX¯Y¯ and Y ¯ f t v ( ϵ ) Y ¯ f t v ϵ bar(Y)sube ftv(epsilon^('))\overline{\mathrm{Y}} \subseteq f t v\left(\epsilon^{\prime}\right)Y¯ftv(ϵ), or ϵ ϵ epsilon\epsilonϵ is Y ¯ = T = ϵ Y ¯ = T = ϵ bar(Y)=T=epsilon^(')\overline{\mathrm{Y}}=\mathrm{T}=\epsilon^{\prime}Y¯=T=ϵ, where ftv ( T ) # X ¯ Y ¯ ftv ( T ) # X ¯ Y ¯ ftv(T)# bar(X) bar(Y)\operatorname{ftv}(\mathrm{T}) \# \overline{\mathrm{X}} \overline{\mathrm{Y}}ftv(T)#X¯Y¯. Then, X ¯ X ¯ EE bar(X)\exists \overline{\mathrm{X}}X¯. ( C ϵ ) ( C ϵ ) (C^^epsilon)(C \wedge \epsilon)(Cϵ) determines Y ¯ Y ¯ bar(Y)\overline{\mathrm{Y}}Y¯.
Proof: Let X ¯ # Y ¯ X ¯ # Y ¯ bar(X)# bar(Y)\overline{\mathrm{X}} \# \overline{\mathrm{Y}}X¯#Y¯ (1). Let ϕ def Γ ϕ def Γ phi|--def Gamma\phi \vdash \operatorname{def} \GammaϕdefΓ in X ¯ X ¯ EE bar(X)\exists \overline{\mathrm{X}}X¯. ( C ϵ ) ( C ϵ ) (C^^epsilon)(C \wedge \epsilon)(Cϵ) (2) and ϕ ϕ phi^(')|--\phi^{\prime} \vdashϕ def Γ Γ Gamma\GammaΓ in X ¯ X ¯ EE bar(X)\exists \overline{\mathrm{X}}X¯. ( C ϵ ) ( 3 ) ( C ϵ ) ( 3 ) (C^^epsilon)(3)(C \wedge \epsilon)(3)(Cϵ)(3), where ϕ ϕ phi\phiϕ and ϕ ϕ phi^(')\phi^{\prime}ϕ coincide outside of Y ¯ Y ¯ bar(Y)\overline{\mathrm{Y}}Y¯. We may assume, w.l.o.g., x ¯ # f t v ( Γ ) x ¯ # f t v ( Γ ) bar(x)#ftv(Gamma)\overline{\mathrm{x}} \# \mathrm{ftv}(\Gamma)x¯#ftv(Γ) (4). By (2), (4), CM-Exists, and CM-And, we obtain ϕ 1 def Γ ϕ 1 def Γ phi_(1)|--def Gamma\phi_{1} \vdash \operatorname{def} \Gammaϕ1defΓ in ϵ ( 5 ) ϵ ( 5 ) epsilon(5)\epsilon(5)ϵ(5), where ϕ ϕ phi\phiϕ and ϕ 1 ϕ 1 phi_(1)\phi_{1}ϕ1 coincide outside x ¯ x ¯ bar(x)\overline{\mathrm{x}}x¯. By CM-Predicate, (5) implies that all members of ϵ ϵ epsilon\epsilonϵ have the same image through ϕ 1 ϕ 1 phi_(1)\phi_{1}ϕ1. Similarly, exploiting (3) and (4), we find that all members of ϵ ϵ epsilon\epsilonϵ have the same image through ϕ 1 ϕ 1 phi_(1)^(')\phi_{1}^{\prime}ϕ1, where ϕ ϕ phi^(')\phi^{\prime}ϕ and ϕ 1 ϕ 1 phi_(1)^(')\phi_{1}^{\prime}ϕ1 coincide outside X ¯ X ¯ bar(X)\overline{\mathrm{X}}X¯. Now, we claim that ϕ 1 ϕ 1 phi_(1)\phi_{1}ϕ1 and ϕ 1 ϕ 1 phi_(1)^(')\phi_{1}^{\prime}ϕ1 coincide on Y ¯ Y ¯ bar(Y)\overline{\mathrm{Y}}Y¯. Once the claim is established, by (1), there follows that ϕ ϕ phi\phiϕ and ϕ ϕ phi^(')\phi^{\prime}ϕ must coincide on Y ¯ Y ¯ bar(Y)\bar{Y}Y¯ as well, which is the goal. So, there only remains to establish the claim; we distinguish two subcases.
Subcase ϵ ϵ epsilon\epsilonϵ is X = ϵ X = ϵ X=epsilon^(')\mathrm{X}=\epsilon^{\prime}X=ϵ and X X ¯ Y ¯ ( 6 ) X X ¯ Y ¯ ( 6 ) X!in bar(X) bar(Y)(6)\mathrm{X} \notin \overline{\mathrm{X}} \overline{\mathrm{Y}}(\mathbf{6})XX¯Y¯(6) and Y ¯ f t v ( ϵ ) Y ¯ f t v ϵ bar(Y)sube ftv(epsilon^('))\overline{\mathrm{Y}} \subseteq f t v\left(\epsilon^{\prime}\right)Y¯ftv(ϵ) (7). Because ϕ 1 ϕ 1 phi_(1)\phi_{1}ϕ1 and ϕ 1 ϕ 1 phi_(1)^(')\phi_{1}^{\prime}ϕ1 coincide outside X ¯ Y ¯ X ¯ Y ¯ bar(X) bar(Y)\overline{\mathrm{X}} \overline{\mathrm{Y}}X¯Y¯ and by (6), we have ϕ 1 ( X ) = ϕ 1 ( X ) ϕ 1 ( X ) = ϕ 1 ( X ) phi_(1)(X)=phi_(1)^(')(X)\phi_{1}(\mathrm{X})=\phi_{1}^{\prime}(\mathrm{X})ϕ1(X)=ϕ1(X). As a result, all members of ϵ ϵ epsilon^(')\epsilon^{\prime}ϵ have the same image through ϕ 1 ϕ 1 phi_(1)\phi_{1}ϕ1 and ϕ 1 ϕ 1 phi_(1)^(')\phi_{1}^{\prime}ϕ1. In a free tree model, where decomposition is valid, a simple inductive argument shows that ϕ 1 ϕ 1 phi_(1)\phi_{1}ϕ1 and ϕ 1 ϕ 1 phi_(1)^(')\phi_{1}^{\prime}ϕ1 must coincide on f t v ( ϵ ) f t v ϵ ftv(epsilon^('))f t v\left(\epsilon^{\prime}\right)ftv(ϵ), hence - by (7) - also on Y ¯ Y ¯ bar(Y)\overline{\mathrm{Y}}Y¯.
Subcase ϵ ϵ epsilon\epsilonϵ is Y ¯ = T = ϵ Y ¯ = T = ϵ bar(Y)=T=epsilon^(')\overline{\mathrm{Y}}=\mathrm{T}=\epsilon^{\prime}Y¯=T=ϵ and f t v ( T ) # X ¯ Y ¯ ( 8 ) f t v ( T ) # X ¯ Y ¯ ( 8 ) ftv(T)# bar(X) bar(Y)(8)f t v(\mathrm{~T}) \# \overline{\mathrm{X}} \overline{\mathrm{Y}}(8)ftv( T)#X¯Y¯(8). Because ϕ 1 ϕ 1 phi_(1)\phi_{1}ϕ1 and ϕ 1 ϕ 1 phi_(1)^(')\phi_{1}^{\prime}ϕ1 coincide outside X ¯ Y ¯ X ¯ Y ¯ bar(X) bar(Y)\bar{X} \bar{Y}X¯Y¯ and by (8), we have ϕ 1 ( T ) = ϕ 1 ( T ) ϕ 1 ( T ) = ϕ 1 ( T ) phi_(1)(T)=phi_(1)^(')(T)\phi_{1}(T)=\phi_{1}^{\prime}(T)ϕ1(T)=ϕ1(T). Thus, for every Y Y ¯ Y Y ¯ Y in bar(Y)Y \in \bar{Y}YY¯, we have ϕ 1 ( Y ) = ϕ 1 ( T ) = ϕ 1 ( T ) = ϕ 1 ( Y ) ϕ 1 ( Y ) = ϕ 1 ( T ) = ϕ 1 ( T ) = ϕ 1 ( Y ) phi_(1)(Y)=phi_(1)(T)=phi_(1)^(')(T)=phi_(1)^(')(Y)\phi_{1}(\mathrm{Y})=\phi_{1}(\mathrm{~T})=\phi_{1}^{\prime}(\mathrm{T})=\phi_{1}^{\prime}(\mathrm{Y})ϕ1(Y)=ϕ1( T)=ϕ1(T)=ϕ1(Y). That is, ϕ 1 ϕ 1 phi_(1)\phi_{1}ϕ1 and ϕ 1 ϕ 1 phi_(1)^(')\phi_{1}^{\prime}ϕ1 coincide on Y ¯ Y ¯ bar(Y)\overline{\mathrm{Y}}Y¯.
Thanks to Lemma 1.8.7, a straightforward implementation of S-LETALL
comes to mind. The problem is, given a constraint X ¯ . U X ¯ . U EE bar(X).U\exists \overline{\mathrm{X}} . UX¯.U, where U U UUU is a standard conjunction of multi-equations, to determine the greatest subset Y ¯ Y ¯ bar(Y)\bar{Y}Y¯ of X ¯ X ¯ bar(X)\bar{X}X¯ such that ( X ¯ Y ¯ ) . U ( X ¯ Y ¯ ) . U EE( bar(X)\\ bar(Y)).U\exists(\overline{\mathrm{X}} \backslash \overline{\mathrm{Y}}) . U(X¯Y¯).U determines Y ¯ Y ¯ bar(Y)\overline{\mathrm{Y}}Y¯. By the first part of the lemma, it is safe for Y ¯ Y ¯ bar(Y)\overline{\mathrm{Y}}Y¯ to include all members of X ¯ X ¯ bar(X)\overline{\mathrm{X}}X¯ that are directly or indirectly dominated (with respect to U U UUU ) by some free variable of x ¯ . U x ¯ . U EE bar(x).U\exists \overline{\mathrm{x}} . Ux¯.U. Those can be found, in time linear in the size of U U UUU, by a top-down traversal of the graph of U U -<_(U)\prec_{U}U. By the second part of the lemma, it is safe to close Y ¯ Y ¯ bar(Y)\overline{\mathrm{Y}}Y¯ under the closure law X X Xin\mathrm{X} \inX X ¯ ( Y Y U X Y Y ¯ ) X Y ¯ X ¯ Y Y U X Y Y ¯ X Y ¯ bar(X)^^(AAYquadY-<_(U)X=>Yin bar(Y))=>Xin bar(Y)\overline{\mathrm{X}} \wedge\left(\forall \mathrm{Y} \quad \mathrm{Y} \prec_{U} \mathrm{X} \Rightarrow \mathrm{Y} \in \overline{\mathrm{Y}}\right) \Rightarrow \mathrm{X} \in \overline{\mathrm{Y}}X¯(YYUXYY¯)XY¯. That is, it is safe to also include all members of X ¯ X ¯ bar(X)\overline{\mathrm{X}}X¯ whose descendants (with respect to U U UUU ) have already been found to be members of Y ¯ Y ¯ bar(Y)\bar{Y}Y¯. This closure computation may be performed, again in linear time, by a bottom-up traversal of the graph of U U -<_(U)\prec_{U}U. When U U UUU is acyclic, it is possible to show that this procedure is complete, that is, does compute the greatest subset Y ¯ Y ¯ bar(Y)\bar{Y}Y¯ that meets our requirement. This is the topic of the following exercise.
1.8.8 Exercise [⋆⋆⋆, ↛]: Assuming U is acyclic, prove that the above procedure computes the greatest subset Ȳ of X̄ such that ∃(X̄ \ Ȳ).U determines Ȳ. In the setting of a regular tree model, exhibit a satisfiable constraint U such that the above procedure is incomplete. Can you define a complete procedure in that setting?
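The two traversals described above can be sketched on an explicit domination graph (a sketch only, not the authors' implementation; `dom[x]` lists the variables that x directly dominates, and, as a simplifying assumption, the bottom-up closure also accepts old children, in line with the second part of Lemma 1.8.7):

```python
# A sketch of the linear-time procedure: given young variables xs, the
# free (old) variables, and a domination graph dom, compute a set ys of
# young variables that the remaining constraint determines.
def determined(xs, free, dom):
    ys = set()
    # Top-down pass: young variables dominated (directly or indirectly)
    # by a free variable are determined by its value.
    stack, seen = list(free), set(free)
    while stack:
        x = stack.pop()
        for y in dom.get(x, []):
            if y not in seen:
                seen.add(y)
                if y in xs:
                    ys.add(y)
                stack.append(y)
    # Bottom-up closure: a young variable all of whose children are
    # already determined (or free) is itself determined by its term.
    changed = True
    while changed:
        changed = False
        for x in xs:
            if x not in ys and all(y in ys or y in free for y in dom.get(x, [])):
                ys.add(x)
                changed = True
    return ys

# X = Y1 -> Y2 with X free determines Y1 and Y2; Z = W with W free (old)
# is picked up by the bottom-up pass.
xs, free = {"Y1", "Y2", "Z"}, {"X", "W"}
dom = {"X": ["Y1", "Y2"], "Z": ["W"]}
assert determined(xs, free, dom) == {"Y1", "Y2", "Z"}
```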
The above discussion has shown that when Y and Z are equated, if Y is young and Z is old, then S-LETALL allows making Y old as well. If binding information is encoded in terms of integer ranks, as suggested earlier, then this remark may be formulated as follows: when Y and Z are equated, if the rank of Y exceeds that of Z, then it may be decreased so that both ranks match. As a result, it is possible to attach ranks to multi-equations, rather than to variables. When two multi-equations are fused, the smaller rank is kept.
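This rank discipline can be sketched with a small union-find structure (a sketch, not an actual implementation: for simplicity, the link direction is arbitrary and the representative's rank is lowered to the minimum, rather than orienting links towards lower ranks as discussed earlier):

```python
# A sketch of rank-aware unification state: each class (multi-equation)
# carries the integer rank of its binding site; fusing two classes keeps
# the smaller rank, so a young variable equated with an old one becomes old.
parent, rank = {}, {}

def make(x, r):
    parent[x], rank[x] = x, r

def find(x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]          # path compression
        x = parent[x]
    return x

def union(x, y):
    rx, ry = find(x), find(y)
    if rx != ry:
        parent[rx] = ry
        rank[ry] = min(rank[rx], rank[ry])     # the smaller (older) rank wins

make("Z", 0)                                   # old: bound by an outer let
make("Y", 3)                                   # young: bound by the current let
union("Y", "Z")
assert find("Y") == find("Z")
assert rank[find("Y")] == 0                    # Y has been made old
```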
S-SOLVE-LET and S-NAME-2 to S-POP-LET are unnecessarily complex when x x x\mathrm{x}x is assigned a monotype T T T\mathrm{T}T, rather than an arbitrary type scheme X ¯ [ D ] X ¯ [ D ] AA bar(X)[D]\forall \overline{\mathrm{X}}[D]X¯[D].T. In that case, the combined effect of these rules may be obtained directly via the following two new rules, which may be implemented in a more efficient way:
S ; U ; let x : T in C  →  S[∃X.□] ; U ∧ X = T ; let x : X in C
    if X ∉ ftv(U, T, C) ∧ T ∉ 𝒱

S ; U ; let x : X in C  →  S[let x : X in □] ; U ; C        (S-SOLVE-LET-MONO)
If T T T\mathrm{T}T isn't a variable, it is replaced with a fresh variable X X X\mathrm{X}X, together with the equation X = T X = T X=T\mathrm{X}=\mathrm{T}X=T. This corresponds to the effect of S-NAME-2. Then, we directly
create an environment frame for x x x\mathrm{x}x, without bothering to create and discard a let frame, since there is no way the type scheme X may be further simplified.
Let us now state and establish the properties of the constraint solver. First, the reduction system is terminating, so it defines an algorithm.
1.8.9 Lemma: The reduction system → is strongly normalizing.
Second, every rewriting step preserves the meaning of the constraint that the current state represents. We recall that the state S ; U ; C S ; U ; C S;U;CS ; U ; CS;U;C is meant to represent the constraint S [ U C ] S [ U C ] S[U^^C]S[U \wedge C]S[UC].
1.8.10 Lemma: S ; U ; C → S′ ; U′ ; C′ implies S[U ∧ C] ≡ S′[U′ ∧ C′].
Proof: By examination of every rule.
  • Case S-Unify. By Lemma 1.8.5.
  • Case S-Ex-1, S-Ex-2, S-Solve-Ex. By C-ExAnd.
  • Case S-Ex-3. By C-LETEx.
  • Case S-Ex-4. By C-InEx.
  • Case S-Solve-Eq, S-Pop-And. By C-Dup.
  • Case S-Solve-ID. Because σ σ sigma\sigmaσ is of the form x ¯ [ U ] . X x ¯ [ U ] . X AA bar(x)[U].X\forall \overline{\mathrm{x}}[U] . \mathrm{X}x¯[U].X, we have f p i ( σ ) = f p i ( σ ) = fpi(sigma)=O/f p i(\sigma)=\varnothingfpi(σ)=. The result follows by C-INID.
  • Case S-Solve-And. By C-AndAnd.
  • Case S-Solve-Let. By C-LetAnd.
  • Case S-NAME-2. By Definition 1.3.21 and C-NAMEEQ, X ∉ ftv(U, T) implies true ⊩ ∀X̄[U].T ≡ ∀X̄X[U ∧ X = T].X. The result follows by Lemma 1.3.22.
  • Case S-COMPRESS. Let θ = [Y ↦ Z]. By Definition 1.3.21 and C-NAMEEQ, Y ≠ Z implies true ⊩ ∀X̄Y[Y = Z = ε ∧ U].X ≡ ∀X̄Y[Y = Z = θ(ε) ∧ θ(U)].θ(X). The result follows by Lemma 1.3.22.
  • Case S-UnName. Using Lemma 1.3.18, it is straightforward to check that Y f t v ( ϵ ) Y f t v ( ϵ ) Y!in ftv(epsilon)\mathrm{Y} \notin f t v(\epsilon)Yftv(ϵ) implies Y Y EEY\exists \mathrm{Y}Y. ( Y = ϵ ) ϵ ( Y = ϵ ) ϵ (Y=epsilon)-=epsilon(\mathrm{Y}=\epsilon) \equiv \epsilon(Y=ϵ)ϵ. The result follows by C-ExAnD and C-LETEx.
  • Case S-LetAll. By C-LetAll.
  • Case S-Pop-Let. By C-LetAnd and C-InAnd*.
  • Case S-Pop-Env. By C-IN*, recalling that σ σ EE sigma\exists \sigmaσ must be true.
Last, we classify the normal forms of the reduction system:
1.8.11 Lemma: A normal form for the reduction system → is one of (i) S ; U ; x ⪯ T, where x ∉ dpi(S); (ii) S ; false ; true; or (iii) 𝒳 ; U ; true, where 𝒳 is an existential constraint context and U a satisfiable conjunction of multi-equations.
Proof: Because, by definition, S ; U ; false is not a valid state, a normal form for S-SOLVE-EQ, S-SOLVE-ID, S-SOLVE-AND, S-SOLVE-EX, and S-SOLVE-LET must be either an instance of the left-hand side of S-SOLVE-ID, with x ∉ dpi(S), which corresponds to case (i), or of the form S ; U ; true. Let us consider the latter case. Because S ; U ; true is a normal form with respect to S-UNIFY, by Lemma 1.8.6, U must be either false or of the form 𝒳[U′], where U′ is a standard conjunction of multi-equations and, if the model is syntactic, U′ is acyclic. The former case corresponds to (ii); thus, let us consider the latter case. Because S ; 𝒳[U′] ; true is a normal form with respect to S-Ex-1, the context 𝒳 must in fact be empty, and U′ is U. If S is an existential constraint context, then we are in situation (iii). Otherwise, because S ; U ; true is a normal form with respect to S-Ex-2, S-Ex-3, and S-Ex-4, the stack S does not end with an existential frame. Because S ; U ; true is a normal form with respect to S-POP-AND and S-POP-ENV, S must then be of the form S′[let x : ∀X̄[□].T in C]. Because S ; U ; true is a normal form with respect to S-NAME-2, T must be a type variable X.
Let us write U as U₁ ∧ U₂, where X̄ # ftv(U₁), and where U₁ is maximal for this criterion. Then, consider a multi-equation ε ∈ U. By the first part of Lemma 1.8.7, if one variable member of ε is free (that is, outside X̄), then ∃X̄.U determines all other variables in ftv(ε). Because S ; U ; true is a normal form with respect to S-LETALL, all variables in ftv(ε) must then be free (that is, outside X̄). By definition of U₁, this implies ε ∈ U₁. By contraposition, for every multi-equation ε ∈ U₂, all variable members of ε are in X̄. Furthermore, let us recall that U₂ is a standard conjunction of multi-equations and, if the model is syntactic, U₂ is acyclic. We let the reader check that this implies ∃X̄.U₂ ≡ true; the proof is a slight generalization of the last part of that of Lemma 1.8.6. Then, S ; U ; true is reducible via S-POP-LET. This is a contradiction, so this last case cannot arise.
In case (i), the constraint $S[U \wedge C]$ has a free program identifier $x$, so it is not satisfiable. In other words, the source program contains an unbound program identifier. Such an error could of course be detected prior to constraint solving, if desired. In case (ii), the unification algorithm failed. By Lemma 1.3.30, the constraint $S[U \wedge C]$ is then false. In case (iii), the constraint $S[U \wedge C]$ is equivalent to $\mathcal{X}[U]$, where $U$ is satisfiable, so it is satisfiable as well. Thus, each of the three classes of normal forms may be immediately identified as denoting success or failure. Hence, Lemmas 1.8.10 and 1.8.11 indeed prove that the algorithm is a constraint solver.

1.9 From ML-the-calculus to ML-the-programming-language

In this section, we explain how to extend the framework developed so far to accommodate operations on values of base type (such as integers), pairs, sums, references, and recursive function definitions. Then, we describe more complex extensions, namely algebraic data type definitions, pattern matching, and type annotations. Last, the issues associated with recursive types are briefly discussed. Exceptions are not discussed; the reader is referred to (TAPL Chapter 14).

Simple extensions

Many features of ML-the-programming-language may be introduced into ML-the-calculus by introducing new constants and extending $\xrightarrow{\delta}$ and $\Gamma_0$ appropriately. In each case, it is necessary to check that the requirements of Definition 1.7.6 are met, that is, that the new initial environment faithfully reflects the nature of the new constants as well as the behavior of the new reduction rules. Below, we describe several such extensions in isolation.
1.9.1 Exercise [Integers, Recommended, $\star\star$]: Integer literals and integer addition have been introduced and given an operational semantics in Examples 1.2.1, 1.2.2, and 1.2.4. Let us now introduce an isolated type constructor int of signature $\star$ and extend the initial environment $\Gamma_0$ with the bindings $\hat{n} : \mathsf{int}$, for every integer $n$, and $\hat{+} : \mathsf{int} \rightarrow \mathsf{int} \rightarrow \mathsf{int}$. Check that these definitions meet the requirements of Definition 1.7.6.
1.9.2 Exercise [Booleans, Recommended, $\star\star$, $\nrightarrow$]: Booleans and conditionals have been introduced and given an operational semantics in Exercise 1.2.6. Introduce an isolated type constructor bool to represent Boolean values and explain how to extend the initial environment. Check that your definitions meet the requirements of Definition 1.7.6. What is the constraint generation rule for the syntactic sugar if $t_0$ then $t_1$ else $t_2$?
1.9.3 Exercise [Pairs, $\star\star$, $\nrightarrow$]: Pairs and pair projections have been introduced and given an operational semantics in Examples 1.2.3 and 1.2.5. Let us now introduce an isolated type constructor $\times$ of signature $\star \otimes \star \Rightarrow \star$, covariant in both of its parameters, and extend the initial environment $\Gamma_0$ with the following bindings:
$$\begin{aligned} (\cdot,\cdot) :\; & \forall XY.\ X \rightarrow Y \rightarrow X \times Y \\ \pi_1 :\; & \forall XY.\ X \times Y \rightarrow X \\ \pi_2 :\; & \forall XY.\ X \times Y \rightarrow Y \end{aligned}$$
Check that these definitions meet the requirements of Definition 1.7.6.
1.9.4 Exercise [Sums, $\star\star$, $\nrightarrow$]: Sums have been introduced and given an operational semantics in Example 1.2.7. Let us now introduce an isolated type constructor $+$ of signature $\star \otimes \star \Rightarrow \star$, covariant in both of its parameters, and extend the initial environment $\Gamma_0$ with the following bindings:
$$\begin{aligned} \mathrm{inj}_1 :\; & \forall XY.\ X \rightarrow X + Y \\ \mathrm{inj}_2 :\; & \forall XY.\ Y \rightarrow X + Y \\ \mathrm{case} :\; & \forall XYZ.\ (X + Y) \rightarrow (X \rightarrow Z) \rightarrow (Y \rightarrow Z) \rightarrow Z \end{aligned}$$
Check that these definitions meet the requirements of Definition 1.7.6.
1.9.5 Exercise [References, $\star\star\star$]: References have been introduced and given an operational semantics in Example 1.2.9. The type constructor ref has been introduced in Definition 1.7.4. Let us now extend the initial environment $\Gamma_0$ with the following bindings:
$$\begin{aligned} \mathrm{ref} :\; & \forall X.\ X \rightarrow \mathrm{ref}\ X \\ \mathord{!} :\; & \forall X.\ \mathrm{ref}\ X \rightarrow X \\ \mathord{:=} :\; & \forall X.\ \mathrm{ref}\ X \rightarrow X \rightarrow X \end{aligned}$$
Check that these definitions meet the requirements of Definition 1.7.6.
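These bindings correspond closely to OCaml's primitive operations on references, as the following sketch shows (note one divergence: OCaml's `:=` returns unit, whereas the binding above returns the stored value):

```ocaml
(* In OCaml: ref : 'a -> 'a ref, (!) : 'a ref -> 'a,
   (:=) : 'a ref -> 'a -> unit. *)
let r = ref 0          (* allocate a cell holding 0 *)
let v = !r             (* read it: v = 0 *)
let () = r := v + 1    (* write 1 into it *)
let final = !r         (* final = 1 *)
```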
1.9.6 Exercise [Recursion, Recommended, $\star\star\star$]: The fixpoint combinator fix has been introduced and given an operational semantics in Example 1.2.10. Let us now extend the initial environment $\Gamma_0$ with the following binding:
$$\mathsf{fix} : \quad \forall XY.\ ((X \rightarrow Y) \rightarrow (X \rightarrow Y)) \rightarrow X \rightarrow Y$$
Check that these definitions meet the requirements of Definition 1.7.6. Recall how the letrec syntactic sugar was defined in Example 1.2.10, and check that this gives rise to the following constraint generation rule:
$$\begin{aligned} & \mathsf{let}\ \Gamma_0\ \mathsf{in}\ \llbracket \mathsf{letrec}\ f = \lambda z.t_1\ \mathsf{in}\ t_2 : T \rrbracket \\ \equiv\; & \mathsf{let}\ \Gamma_0\ \mathsf{in}\ \mathsf{let}\ f : \forall XY[\mathsf{let}\ f : X \rightarrow Y; z : X\ \mathsf{in}\ \llbracket t_1 : Y \rrbracket].X \rightarrow Y\ \mathsf{in}\ \llbracket t_2 : T \rrbracket \end{aligned}$$
Note the somewhat peculiar structure of this constraint: the program variable $f$ is bound twice in it, with different type schemes. The constraint requires all occurrences of $f$ within $t_1$ to be assigned the monomorphic type $X \rightarrow Y$. This type is generalized and turned into a type scheme before inspecting $t_2$, however, so every occurrence of $f$ within $t_2$ may receive a different type, as usual with let-polymorphism. A more powerful way of typechecking recursive function definitions is discussed in Section 1.10 (page 113).
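OCaml's `let rec` exhibits exactly this discipline, which the following sketch (with illustrative definitions only) makes concrete:

```ocaml
(* Within its own body, a recursive function is monomorphic:
   every recursive call to map_id must use it at the same type
   X list -> X list. *)
let rec map_id (l : 'a list) : 'a list =
  match l with
  | [] -> []
  | x :: tail -> x :: map_id tail

(* After the definition, the type is generalized to
   'a list -> 'a list, so distinct uses may instantiate it
   differently, as usual with let-polymorphism. *)
let ints = map_id [1; 2]
let bools = map_id [true]

(* By contrast, a definition whose body uses the function at two
   incompatible types, such as
     let rec f x = ignore (f 0); ignore (f true); x
   is rejected: inside its own body, f cannot be polymorphic. *)
```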

Algebraic data types

Exercises 1.9.3 and 1.9.4 have shown how to extend the language with binary, anonymous products and sums. These constructs are quite general, but still have several shortcomings. First, they are only binary, while we would like to have $k$-ary products and sums, for arbitrary $k \geq 0$. Such a generalization is of course straightforward. Second, more interestingly, their components must be referred to by numeric index (as in "please extract the second component of the pair"), rather than by name ("extract the component named y"). In practice, it is crucial to use names, because they make programs more readable and more robust in the face of changes. One could introduce a mechanism that allows defining names as syntactic sugar for numeric indices. That would help a little, but not much, because these names would not appear in types, which would still be made of anonymous products and sums. Third, in the absence of recursive types, products and sums do not have sufficient expressiveness to allow defining unbounded data structures, such as lists. Indeed, it is easy to see that every value whose type $T$ is composed of base types (int, bool, etc.), products, and sums must have bounded size, where the bound $|T|$ is a function of $T$. More precisely, up to a constant factor, we have $|\mathsf{int}| = |\mathsf{bool}| = 1$, $|T_1 \times T_2| = 1 + |T_1| + |T_2|$, and $|T_1 + T_2| = 1 + \max(|T_1|, |T_2|)$. The following example describes another facet of the same problem.
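The size bound just defined is a simple structural recursion; the following OCaml sketch (over a hypothetical `ty` grammar restricted to base types, products, and sums) computes it:

```ocaml
(* A minimal type grammar: base types, products, and sums. *)
type ty = Int | Bool | Prod of ty * ty | Sum of ty * ty

(* |int| = |bool| = 1, |T1 × T2| = 1 + |T1| + |T2|,
   |T1 + T2| = 1 + max(|T1|, |T2|): every value of type t has
   size at most (a constant factor times) bound t. *)
let rec bound (t : ty) : int =
  match t with
  | Int | Bool -> 1
  | Prod (t1, t2) -> 1 + bound t1 + bound t2
  | Sum (t1, t2) -> 1 + max (bound t1) (bound t2)
```

Since every type in this grammar has a finite bound, no such type can describe lists of unbounded length.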
1.9.7 Example: A list is either empty, or a pair of an element and another list. So, it seems natural to try and encode the type of lists as a sum of some arbitrary type (say, unit), on the one hand, and of a product of some element type and of the type of lists itself, on the other hand. With this encoding in mind, we can go ahead and write code - for instance, a function that computes the length of a list:
$$\mathsf{letrec}\ \mathit{length} = \lambda l.\ \mathsf{case}\ l\ (\lambda\_.\,\hat{0})\ (\lambda z.\ \hat{1} \mathbin{\hat{+}} \mathit{length}\ (\pi_2\, z))$$
We have used integers, pairs, sums, and the letrec construct introduced in the previous section. The code analyzes the list $l$ using a case construct. If the left branch is taken, the list is empty, so $\hat{0}$ is returned. If the right branch is taken, then $z$ becomes bound to a pair of some element and the tail of the list. The latter is obtained using the projection operator $\pi_2$. Its length is computed using a recursive call to length and incremented by 1. This code makes perfect sense. However, applying the constraint generation and constraint solving algorithms eventually leads to an equation of the form $X = Y + (Z \times X)$, where $X$ stands for the type of $l$. This equation accurately reflects our encoding of the type of lists. However, in a syntactic model, it has no solution, so our definition of length is ill-typed. It is possible to adopt a free
regular tree model, thus introducing equirecursive types into the system (TAPL Chapter 20); however, there are good reasons not to do so (page 106).
To work around this problem, ML-the-programming-language offers algebraic data type definitions, whose elegance lies in the fact that, while representing only a modest theoretical extension, they do solve the three problems mentioned above. An algebraic data type may be viewed as an abstract type that is declared to be isomorphic to a ($k$-ary) product or sum type with named components. The type of each component is declared as well, and may refer to the algebraic data type that is being defined: thus, algebraic data types are isorecursive (TAPL Chapter 20). In order to allow sufficient flexibility when declaring the type of each component, algebraic data type definitions may be parameterized by a number of type variables. Last, in order to allow the description of complex data structures, it is necessary to allow several algebraic data types to be defined at once; the definitions may then be mutually recursive. In fact, in order to simplify this formal presentation, we assume that all algebraic data types are defined at once at the beginning of the program. This decision is of course at odds with modular programming, but will not otherwise be a problem.
In the following, $D$ ranges over a set of data types. We assume that data types form a subset of type constructors. We require each of them to be isolated and to have a signature of the form $\vec{\kappa} \Rightarrow \star$. Furthermore, $\ell$ ranges over a set $\mathcal{L}$ of labels, which we use indifferently as data constructors and as record labels. An algebraic data type definition is either a variant type definition or a record type definition, whose respective forms are
$$D\,\vec{X} \approx \sum_{i=1}^{k} \ell_i : T_i \qquad \text{and} \qquad D\,\vec{X} \approx \prod_{i=1}^{k} \ell_i : T_i$$
In either case, $k$ must be nonnegative. If $D$ has signature $\vec{\kappa} \Rightarrow \star$, then the type variables $\vec{X}$ must have kind $\vec{\kappa}$. Every $T_i$ must have kind $\star$. We refer to $\overline{X}$ as the parameters and to $\vec{T}$ (the vector formed by $T_1, \ldots, T_k$) as the components of the definition. The parameters are bound within the components, and the definition must be closed, that is, $ftv(\vec{T}) \subseteq \overline{X}$ must hold. Last, for an algebraic data type definition to be valid, the behavior of the type constructor $D$ with respect to subtyping must match its definition. This requirement is clarified below.
1.9.8 Definition: Consider an algebraic data type definition whose parameters and components are respectively $\vec{X}$ and $\vec{T}$. Let $\vec{X}'$ and $\vec{T}'$ be their images under an arbitrary renaming. Then, $D\,\vec{X} \leq D\,\vec{X}' \Vdash \vec{T} \leq \vec{T}'$ must hold.
The above requirement bears on the definition of subtyping in the model. The idea is, since $D\,\vec{X}$ is declared to be isomorphic to (a sum or a product of)
$\vec{T}$, whenever two types built with $D$ are comparable, their unfoldings should be comparable as well. The reverse entailment assertion is not required for type soundness, and it is sometimes useful to declare algebraic data types that do not validate it: so-called phantom types (Fluet and Pucella, 2002). Note that the requirement may always be satisfied by making the type constructor $D$ invariant in all of its parameters. Indeed, in that case, $D\,\vec{X} \leq D\,\vec{X}'$ entails $\vec{X} = \vec{X}'$, which must entail $\vec{T} = \vec{T}'$, since $\vec{T}'$ is precisely $[\vec{X} \mapsto \vec{X}']\,\vec{T}$. In an equality free tree model, every type constructor is naturally invariant, so the requirement is trivially satisfied. In other settings, however, it is often possible to satisfy the requirement of Definition 1.9.8 while assigning $D$ a less restrictive variance. The following example illustrates such a case.
1.9.9 Example: Let list be a data type of signature $\star \Rightarrow \star$. Let Nil and Cons be data constructors. Then, the following is a definition of list as a variant type:
$$\mathsf{list}\ X \approx \Sigma(\mathrm{Nil} : \mathsf{unit};\ \mathrm{Cons} : X \times \mathsf{list}\ X)$$
Because data types form a subset of type constructors, it is valid to form the type $\mathsf{list}\ X$ in the right-hand side of the definition, even though we are still in the process of defining the meaning of list. In other words, data type definitions may be recursive. However, because $\approx$ is not interpreted as equality, the type $\mathsf{list}\ X$ is not a recursive type: it is nothing but an application of the unary type constructor list to the type variable $X$. To check that the definition of list satisfies the requirement of Definition 1.9.8, we must ensure that
$$\mathsf{list}\ X \leq \mathsf{list}\ X' \Vdash \mathsf{unit} \leq \mathsf{unit} \wedge X \times \mathsf{list}\ X \leq X' \times \mathsf{list}\ X'$$
holds. This assertion is equivalent to $\mathsf{list}\ X \leq \mathsf{list}\ X' \Vdash X \leq X'$. To satisfy the requirement, it is sufficient to make list a covariant type constructor, that is, to define subtyping in the model so that $\mathsf{list}\ X \leq \mathsf{list}\ X' \equiv X \leq X'$ holds.
Let tree be a data type of signature $\star \Rightarrow \star$. Let root and sons be record labels. Then, the following is a definition of tree as a record type:
$$\mathsf{tree}\ X \approx \Pi(\mathrm{root} : X;\ \mathrm{sons} : \mathsf{list}\ (\mathsf{tree}\ X))$$
This definition is again recursive, and relies on the previous definition. Because list is covariant, it is straightforward to check that the definition of tree is valid if tree is made a covariant type constructor as well.
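These two definitions have direct counterparts in OCaml, sketched below for illustration (`list_` avoids a clash with the built-in list type, and OCaml's nullary `Nil` stands in for the chapter's unary `Nil : unit`):

```ocaml
(* list X ≈ Σ(Nil : unit; Cons : X × list X) *)
type 'a list_ = Nil | Cons of 'a * 'a list_

(* tree X ≈ Π(root : X; sons : list (tree X)); the definition is
   recursive and relies on the previous one. *)
type 'a tree = { root : 'a; sons : 'a tree list_ }

(* Applying a constructor folds the definitions. *)
let leaf (x : 'a) : 'a tree = { root = x; sons = Nil }
let t : int tree = { root = 1; sons = Cons (leaf 2, Nil) }
```

Pattern matching on `Cons` or projecting the record fields unfolds the definitions again.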
1.9.10 Exercise [$\star\star$, $\nrightarrow$]: Consider a nonrecursive algebraic data type definition, where the variance of every type constructor that appears on the right-hand side is known. Can you systematically determine, for each of the parameters, the least restrictive variance that makes the definition valid? Generalize this procedure to the case of recursive and mutually recursive algebraic data type definitions.
A prologue is a set of algebraic data type definitions, where each data type is defined at most once and where each data constructor or record label appears at most once. A program is a pair of a prologue and an expression. The effect of a prologue is to enrich the programming language with new constants. That is, a variant type definition extends the operational semantics with several injections and a case construct, as in Example 1.2.7. A record type definition extends it with a record formation construct and several projections, as in Examples 1.2.3 and 1.2.5. In either case, the initial typing environment $\Gamma_0$ is extended with information about these new constants. Thus, algebraic data type definitions might be viewed as a simple configuration language that allows specifying in which instance of ML-the-calculus the expression that follows the prologue should be typechecked and interpreted. Let us now give a precise account of this phenomenon.
To begin, suppose the prologue contains the definition $D\,\vec{X} \approx \sum_{i=1}^{k} \ell_i : T_i$. Then, for each $i \in \{1, \ldots, k\}$, a constructor of arity 1, named $\ell_i$, is introduced. Furthermore, a destructor of arity $k+1$, named $\mathsf{case}_D$, is introduced. When $k > 0$, it is common to write $\mathsf{case}\ t\ [\ell_i : t_i]_{i=1}^{k}$ for the application $\mathsf{case}_D\ t\ t_1 \ldots t_k$. The operational semantics is extended with the following reduction rules, for $i \in \{1, \ldots, k\}$:
$$\mathsf{case}\ (\ell_i\, v)\ [\ell_j : v_j]_{j=1}^{k} \xrightarrow{\delta} v_i\ v \tag{R-Alg-Case}$$
For each $i \in \{1, \ldots, k\}$, the initial environment is extended with the binding $\ell_i : \forall\overline{X}.\ T_i \rightarrow D\,\vec{X}$. It is further extended with the binding $\mathsf{case}_D : \forall\overline{X}Z.\ D\,\vec{X} \rightarrow (T_1 \rightarrow Z) \rightarrow \ldots \rightarrow (T_k \rightarrow Z) \rightarrow Z$.
Now, suppose the prologue contains the definition $D\,\vec{X} \approx \prod_{i=1}^{k} \ell_i : T_i$. Then, for each $i \in \{1, \ldots, k\}$, a destructor of arity 1, named $\ell_i$, is introduced. Furthermore, a constructor of arity $k$, named $\mathsf{make}_D$, is introduced. It is common to write $t.\ell$ for the application $\ell\ t$ and, when $k > 0$, to write $\{\ell_i = t_i\}_{i=1}^{k}$ for the application $\mathsf{make}_D\ t_1 \ldots t_k$. The operational semantics is extended with the following reduction rules, for $i \in \{1, \ldots, k\}$:
$$(\{\ell_j = v_j\}_{j=1}^{k}).\ell_i \xrightarrow{\delta} v_i \tag{R-Alg-Proj}$$
For each $i \in \{1, \ldots, k\}$, the initial environment is extended with the binding $\ell_i : \forall\overline{X}.\ D\,\vec{X} \rightarrow T_i$. It is further extended with the binding $\mathsf{make}_D : \forall\overline{X}.\ T_1 \rightarrow \ldots \rightarrow T_k \rightarrow D\,\vec{X}$.
1.9.11 Example: The effect of defining list (Example 1.9.9) is to make Nil and Cons data constructors of arity 1 and to introduce a ternary destructor $\mathsf{case}_\mathsf{list}$. The definition also extends the initial environment as follows:

$$\begin{aligned} \mathrm{Nil} :\; & \forall X.\ \mathsf{unit} \rightarrow \mathsf{list}\ X \\ \mathrm{Cons} :\; & \forall X.\ X \times \mathsf{list}\ X \rightarrow \mathsf{list}\ X \\ \mathsf{case}_\mathsf{list} :\; & \forall XZ.\ \mathsf{list}\ X \rightarrow (\mathsf{unit} \rightarrow Z) \rightarrow (X \times \mathsf{list}\ X \rightarrow Z) \rightarrow Z \end{aligned}$$

Thus, the value $\mathrm{Cons}(\hat{0}, \mathrm{Nil}\,())$, an integer list of length 1, has type list int. A function that computes the length of a list may now be written as follows:
$$\mathsf{letrec}\ \mathit{length} = \lambda l.\ \mathsf{case}\ l\ [\,\mathrm{Nil} : \lambda\_.\,\hat{0} \mid \mathrm{Cons} : \lambda z.\ \hat{1} \mathbin{\hat{+}} \mathit{length}\ (\pi_2\, z)\,]$$
Recall that this notation is syntactic sugar for
$$\mathsf{letrec}\ \mathit{length} = \lambda l.\ \mathsf{case}_\mathsf{list}\ l\ (\lambda\_.\,\hat{0})\ (\lambda z.\ \hat{1} \mathbin{\hat{+}} \mathit{length}\ (\pi_2\, z))$$
The difference with the code in Example 1.9.7 appears minimal: the case construct is now annotated with the data type list. As a result, the type inference algorithm employs the type scheme assigned to $\mathsf{case}_\mathsf{list}$, which is derived from the definition of list, instead of the type scheme assigned to the anonymous case construct, given in Exercise 1.9.4. This is good for a couple of reasons. First, the former is more informative than the latter, because it contains the type $T_i$ associated with the data constructor $\ell_i$. Here, for instance, the generated constraint requires the type of $z$ to be $X \times \mathsf{list}\ X$ for some $X$, so a good error message would be given if a mistake was made in the second branch, such as omitting the use of $\pi_2$. Second, and more fundamentally, the code is now well-typed, even in the absence of recursive types. In Example 1.9.7, a cyclic equation was produced because case required the type of $l$ to be a sum type and because a sum type carries the types of its left and right branches as subterms. Here, instead, $\mathsf{case}_\mathsf{list}$ requires $l$ to have type $\mathsf{list}\ X$ for some $X$. This is an abstract type: it does not explicitly contain the types of the branches. As a result, the generated constraint no longer involves a cyclic equation. It is, in fact, satisfiable; the reader may check that length has type $\forall X.\ \mathsf{list}\ X \rightarrow \mathsf{int}$, as expected.
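The same program can be written against OCaml's algebraic data types; this sketch (with a hypothetical `list_` type mirroring Example 1.9.9, the chapter's unary `Nil : unit` rendered as a nullary constructor) typechecks without recursive types, and inference assigns length the expected polymorphic scheme:

```ocaml
type 'a list_ = Nil | Cons of 'a * 'a list_

(* Matching on Nil and Cons plays the role of case_list: it
   requires l to have the abstract type list X, so no cyclic
   equation X = Y + (Z × X) ever arises. The inferred scheme is
   length : 'a list_ -> int. *)
let rec length (l : 'a list_) : int =
  match l with
  | Nil -> 0
  | Cons (_, tail) -> 1 + length tail
```

For instance, `Cons (0, Nil)` has type `int list_`, and `length (Cons (0, Nil))` evaluates to 1.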
Example 1.9.11 stresses the importance of using declared, abstract types, as opposed to anonymous, concrete sum or product types, in order to obviate the need for recursive types. The essence of the trick lies in the fact that the type schemes associated with operations on algebraic data types implicitly fold and unfold the data type's definition. More precisely, let us recall the type scheme assigned to the $i^{\text{th}}$ injection in the setting of ($k$-ary) anonymous sums: it is $\forall X_1 \ldots X_k.\ X_i \rightarrow X_1 + \ldots + X_k$, or, more concisely, $\forall X_1 \ldots X_k.\ X_i \rightarrow \sum_{i=1}^{k} X_i$. By instantiating each $X_i$ with $T_i$ and generalizing again, we find that a more specific type scheme is $\forall\overline{X}.\ T_i \rightarrow \sum_{i=1}^{k} T_i$. Perhaps this could have been the type scheme assigned to $\ell_i$? Instead, however, it is $\forall\overline{X}.\ T_i \rightarrow D\,\vec{X}$.
We now realize that this type scheme not only reflects the operational behavior of the $i^{\text{th}}$ injection, but also folds the definition of the algebraic data type $D$ by turning the anonymous sum $\sum_{i=1}^{k} T_i$, which forms the definition's right-hand side, into the parameterized abstract type $D\,\vec{X}$, which is the definition's left-hand side. Conversely, the type scheme assigned to $\mathsf{case}_D$ unfolds the definition. The
situation is identical in the case of record types: in either case, constructors fold, destructors unfold. In other words, occurrences of data constructors and record labels in the code may be viewed as explicit instructions for the typechecker to fold or unfold an algebraic data type definition. This mechanism is characteristic of isorecursive types.
1.9.12 Exercise [$\star$, $\nrightarrow$]: For a fixed $k$, check that all of the machinery associated with $k$-ary anonymous products, that is, constructors, destructors, reduction rules, and extensions to the initial typing environment, may be viewed as the result of a single algebraic data type definition. Conduct a similar check in the case of $k$-ary anonymous sums.
1.9.13 Exercise [$\star\star\star$, $\nrightarrow$]: Check that the above definitions meet the requirements of Definition 1.7.6.
1.9.14 EXERCISE [$\star\star\star$, $\nrightarrow$]: For the sake of simplicity, we have assumed that data constructors are always of arity one. It is indeed possible to allow data constructors of any arity and define variants as $D\,\vec{X} \approx \sum_{i=1}^{k} \ell_i : \vec{T}_i$. For instance, the definition of lists could then be $\mathit{list}\ X \approx \Sigma(\mathit{Nil};\ \mathit{Cons} : X \times \mathit{list}\ X)$, and, for instance, $\mathit{Cons}(\hat{0}, \mathit{Nil})$ would be a list value. Make the necessary changes in the definitions above and check that they still meet the requirements of Definition 1.7.6.
In this formal presentation of algebraic data types, we have assumed that all algebraic data type definitions are known before the program is typechecked. This simplifying assumption is forced on us by the fact that we interpret constraints in a fixed model, that is, we assume a fixed universe of types. In practice, programming languages have module systems, which allow distinct modules to have distinct, partial views of the universe of types. Then, it becomes possible for each module to come with its own data type definitions. Interestingly, it is even possible, in principle, to split the definition of a single data type over several modules, yielding extensible algebraic data types. For instance, module $A$ might declare the existence of a parameterized variant type $D\,\vec{X}$, without giving its components. Later on, module $B$ might define a component $\ell : T$, where $\mathit{ftv}(T) \subseteq \bar{X}$. Such a definition makes $\ell$ a unary constructor with type scheme $\forall \bar{X}.\,T \rightarrow D\,\vec{X}$, as before. It becomes impossible, however, to introduce a destructor $\mathtt{case}_D$, because the definition of an extensible variant type can never be assumed to be complete; other, unknown modules might extend it further.
To compensate for its absence, one may supplement every constructor $\ell$ with a destructor $\ell^{-1}$, whose semantics is given by $\ell^{-1}\,(\ell\,v)\,v_1\,v_2 \xrightarrow{\delta} v_1\,v$ and $\ell^{-1}\,(\ell'\,v)\,v_1\,v_2 \xrightarrow{\delta} v_2\,(\ell'\,v)$ when $\ell \neq \ell'$, and whose type scheme is $\forall \bar{X}Z.\,D\,\vec{X} \rightarrow (T \rightarrow Z) \rightarrow (D\,\vec{X} \rightarrow Z) \rightarrow Z$. When pattern matching is available, $\ell^{-1}$ may in fact be defined in the language. ML-the-programming-language does not offer extensible algebraic data types as a language feature, but does have one built-in extensible variant type, namely the type exn of exceptions. Thus, it is possible to define new constructors for the type exn within any module. The price of this extra flexibility is that no exhaustive case analysis on values of type exn is possible.
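The two $\delta$-rules for $\ell^{-1}$ can be simulated concretely. The following Python sketch is only an illustration of their behavior; the tuple encoding of variant values and all function names are ours, not part of the calculus:

```python
# Values of an extensible variant type are modeled as (label, payload)
# pairs; this encoding is an assumption of the sketch, not the calculus.

def make(label, payload):
    return (label, payload)

def inv(label, value, k_yes, k_no):
    """Model of the destructor l^-1: apply k_yes to the payload when the
    tags agree, and pass the whole value to k_no otherwise."""
    tag, payload = value
    if tag == label:
        return k_yes(payload)      # l^-1 (l v) v1 v2 --> v1 v
    return k_no(value)             # l^-1 (l' v) v1 v2 --> v2 (l' v), l != l'

v = make("Some", 42)
print(inv("Some", v, lambda x: x + 1, lambda w: -1))  # 43
print(inv("None", v, lambda x: x + 1, lambda w: -1))  # -1
```

Because the second continuation receives the whole value $\ell'\,v$ rather than its payload, a chain of `inv` calls performs a case analysis that remains open to labels added later, which is exactly why no exhaustiveness check is possible.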
One significant drawback of algebraic data type definitions resides in the fact that a label $\ell$ cannot be shared by two distinct variant or record type definitions. Indeed, every algebraic data type definition extends the calculus with new constants. Strictly speaking, our presentation does not allow a single constant $c$ to be associated with two distinct definitions. Even if we did allow such a collision, the initial environment would contain two bindings for $c$, one of which would then become inaccessible. This phenomenon arises in actual implementations of ML-the-programming-language, where a new algebraic data type definition may hide some of the data constructors or record labels introduced by a previous definition. An elegant solution to this lack of expressiveness is discussed in Section 1.11.

Pattern matching

Our presentation of products, sums, and algebraic data types has remained within the setting of ML-the-calculus: that is, data structures have been built out of constructors, while the case analysis and record access operations have been viewed as destructors. Some syntactic sugar has been used to recover standard notations. The language is now expressive enough to allow defining and manipulating complex data structures, such as lists and trees. Yet, experience shows that programming in such a language is still somewhat cumbersome. Indeed, case analysis and record access are low-level operations: the former allows inspecting a tag and branching, while the latter allows dereferencing a pointer. In practice, one often needs to carry out more complex tasks, such as determining whether a data structure has a certain shape or whether two data structures have comparable shapes. Currently, the only way to carry out these tasks is to program an explicit sequence of low-level operations. It would be much preferable to extend the language so that it becomes directly possible to describe shapes, called patterns, and so that checking whether a pattern matches a value becomes an elementary operation. ML-the-programming-language offers this feature, called pattern matching. Although pattern matching may be added to ML-the-calculus by introducing a family of destructors, we choose instead to extend the calculus with a new match construct, which subsumes the existing let construct. This approach appears somewhat simpler and more powerful. We now carry out this extension.

Figure 1-13: Patterns and pattern matching
Let us first define the syntax of patterns (Figure 1-13) and describe (informally, for now) which values they match. To a pattern $p$, we associate a set of defined program variables $\mathit{dpi}(p)$, whose definition appears in the text that follows. The pattern $p$ is well-formed if and only if $\mathit{dpi}(p)$ is defined. To begin, the wildcard $\_$ is a pattern, which matches every value and binds no variables. We let $\mathit{dpi}(\_) = \varnothing$. Although the wildcard may be viewed as an anonymous variable, and we have done so thus far, it is now simpler to view it as a distinct pattern. A program variable $z$ is also a pattern, which matches every value and binds $z$ to the matched value. We let $\mathit{dpi}(z) = \{z\}$. Next, if $c$ is a constructor of arity $k$, then $c\,p_1 \ldots p_k$ is a pattern, which matches $c\,v_1 \ldots v_k$ when $p_i$ matches $v_i$ for every $i \in \{1, \ldots, k\}$. We let $\mathit{dpi}(c\,p_1 \ldots p_k) = \mathit{dpi}(p_1) \uplus \ldots \uplus \mathit{dpi}(p_k)$. That is, the pattern $c\,p_1 \ldots p_k$ is well-formed when $p_1, \ldots, p_k$ define disjoint sets of variables.
This condition rules out nonlinear patterns such as $(z, z)$. Defining the semantics of such a pattern would require a notion of equality at every type, which introduces various complications, so it is commonly considered ill-formed. The pattern $p_1 \wedge p_2$ matches all values that both $p_1$ and $p_2$ match. It is commonly used with $p_2$ a program variable: then, it allows examining the shape of a value and binding a name to it at the same time. Again, we define $\mathit{dpi}(p_1 \wedge p_2) = \mathit{dpi}(p_1) \uplus \mathit{dpi}(p_2)$. The pattern $p_1 \vee p_2$ matches all values that either $p_1$ or $p_2$ matches. We define $\mathit{dpi}(p_1 \vee p_2) = \mathit{dpi}(p_1) = \mathit{dpi}(p_2)$. That is, the pattern $p_1 \vee p_2$ is well-formed when $p_1$ and $p_2$ define the same variables. Thus, $(\mathit{inj}_1\,z) \vee (\mathit{inj}_2\,z)$ is a well-formed pattern, which binds $z$ to the component of a binary sum, without regard for its tag. However, $(\mathit{inj}_1\,z_1) \vee (\mathit{inj}_2\,z_2)$ is ill-formed, because one cannot statically predict whether it defines $z_1$ or $z_2$.
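The well-formedness conditions on $\mathit{dpi}$ are easy to machine-check. Here is a small Python sketch (the tuple encoding of patterns is ours); it returns the set of defined variables, or None when the pattern is ill-formed:

```python
# dpi(p): the set of program variables a pattern defines; None stands for
# "undefined", i.e. an ill-formed pattern. Patterns are encoded as tuples.

def dpi(p):
    kind = p[0]
    if kind == "wild":
        return set()
    if kind == "var":
        return {p[1]}
    if kind == "ctor":                      # ("ctor", c, p1, ..., pk)
        acc = set()
        for sub in p[2:]:
            s = dpi(sub)
            if s is None or acc & s:        # subpatterns must be disjoint
                return None
            acc |= s
        return acc
    if kind == "and":                       # p1 /\ p2: disjoint union
        s1, s2 = dpi(p[1]), dpi(p[2])
        if s1 is None or s2 is None or s1 & s2:
            return None
        return s1 | s2
    s1, s2 = dpi(p[1]), dpi(p[2])           # p1 \/ p2: same variables
    if s1 is None or s2 is None or s1 != s2:
        return None
    return s1

# (inj1 z) \/ (inj2 z) is well-formed; (inj1 z1) \/ (inj2 z2) is not.
print(dpi(("or", ("ctor", "inj1", ("var", "z")),
                 ("ctor", "inj2", ("var", "z")))))    # {'z'}
print(dpi(("or", ("ctor", "inj1", ("var", "z1")),
                 ("ctor", "inj2", ("var", "z2")))))   # None
```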
Let us now formally define whether a pattern $p$ matches a value $v$ and how the variables in $\mathit{dpi}(p)$ become bound to values in the process. This is done by introducing a generalized substitution, written $[p \mapsto v]$, which is either undefined or a substitution of values for the program variables in $\mathit{dpi}(p)$. If the former, then $p$ does not match $v$. If the latter, then $p$ matches $v$ and, for every $z \in \mathit{dpi}(p)$, the variable $z$ becomes bound to the value $[p \mapsto v]z$. Of course, when $p$ is a variable $z$, the generalized substitution $[z \mapsto v]$ is defined and coincides with the substitution $[z \mapsto v]$, which justifies our abuse of notation. To construct generalized substitutions, we use two simple combinators. First, when $\mathit{dpi}(p_1)$ and $\mathit{dpi}(p_2)$ are disjoint, $[p_1 \mapsto v_1] \otimes [p_2 \mapsto v_2]$ stands for the set-theoretic union of $[p_1 \mapsto v_1]$ and $[p_2 \mapsto v_2]$, if both are defined, and is undefined otherwise. We use this combinator to ensure that $p_1$ matches $v_1$ and $p_2$ matches $v_2$ and to combine the two corresponding sets of bindings.
Second, when $o_1$ and $o_2$ are two possibly undefined mathematical objects that belong to the same space when defined, $o_1 \oplus o_2$ stands for $o_1$, if it is defined, and for $o_2$ otherwise; that is, $\oplus$ is an angelic choice operator with a left bias. In particular, when $\mathit{dpi}(p_1)$ and $\mathit{dpi}(p_2)$ coincide, $[p_1 \mapsto v_1] \oplus [p_2 \mapsto v_2]$ stands for $[p_1 \mapsto v_1]$, if it is defined, and for $[p_2 \mapsto v_2]$ otherwise. We use this combinator to ensure that $p_1$ matches $v_1$ or $p_2$ matches $v_2$ and to retain the corresponding set of bindings. The full definition of generalized substitutions, which relies on these combinators, appears in Figure 1-13. It reflects the informal presentation of the previous paragraph.
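These two combinators, and the generalized substitution $[p \mapsto v]$ itself, can be transcribed directly. In the Python sketch below, None plays the role of "undefined", a substitution is a dictionary, and the tuple encodings of patterns and values are our own assumptions:

```python
def otimes(s1, s2):
    """[p1 |-> v1] (x) [p2 |-> v2]: union of the bindings when both are
    defined (their domains are assumed disjoint), undefined otherwise."""
    if s1 is None or s2 is None:
        return None
    return {**s1, **s2}

def oplus(s1, s2):
    """o1 (+) o2: left-biased choice between possibly undefined objects."""
    return s1 if s1 is not None else s2

def match(p, v):
    """The generalized substitution [p |-> v], or None if p does not match v."""
    kind = p[0]
    if kind == "wild":
        return {}
    if kind == "var":
        return {p[1]: v}
    if kind == "ctor":                     # pattern ("ctor", c, p1, ..., pk),
        if v[0] != p[1] or len(v) - 1 != len(p) - 2:
            return None                    # value ("c", v1, ..., vk)
        subst = {}
        for sub, w in zip(p[2:], v[1:]):
            subst = otimes(subst, match(sub, w))
        return subst
    if kind == "and":
        return otimes(match(p[1], v), match(p[2], v))
    return oplus(match(p[1], v), match(p[2], v))   # "or" pattern

# Cons(_, z) matches Cons(0, Nil) and binds z to Nil.
print(match(("ctor", "Cons", ("wild",), ("var", "z")), ("Cons", 0, ("Nil",))))
# {'z': ('Nil',)}
```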
Once patterns and pattern matching are defined, it is straightforward to extend the syntax and operational semantics of ML-the-calculus. We enrich the syntax of expressions with a new construct, match $t$ with $(p_i.t_i)_{i=1}^{k}$, where $k \geq 1$. It consists of a term $t$ and a nonempty, ordered list of clauses, each of which is composed of a pattern $p_i$ and a term $t_i$. The syntax of evaluation contexts is extended as well, so that the term $t$ that is being examined is first reduced to a value $v$. The operational semantics is extended with a new rule, R-MATCH, which states that match $v$ with $(p_i.t_i)_{i=1}^{k}$ reduces to $[p_i \mapsto v]\,t_i$, where $i$ is the least element of $\{1, \ldots, k\}$ such that $p_i$ matches $v$. Technically, $\bigoplus_{i=1}^{k} [p_i \mapsto v]\,t_i$ stands for $[p_1 \mapsto v]\,t_1 \oplus \ldots \oplus [p_k \mapsto v]\,t_k$, so that the reduct is the first term in this sequence that is defined.
As far as semantics is concerned, the match construct may be viewed as a generalization of the let construct. Indeed, let $z = t_1$ in $t_2$ may now be viewed as syntactic sugar for match $t_1$ with $z.t_2$, that is, a match construct with a single clause and a variable pattern. Then, R-LET becomes a special case of R-MATCH.
It is pleasant to introduce some more syntactic sugar. We write $\lambda(p_i.t_i)_{i=1}^{k}$ for $\lambda z.$ match $z$ with $(p_i.t_i)_{i=1}^{k}$, where $z$ is fresh for $(p_i.t_i)_{i=1}^{k}$. Thus, it becomes possible to define functions by cases, a common idiom in ML-the-programming-language.
1.9.15 EXAMPLE: Using pattern matching, a function that computes the length of a list (Example 1.9.11) may now be written as follows:

$$\mathtt{letrec}\ \mathit{length} = \lambda(\mathit{Nil}.\,\hat{0} \mid \mathit{Cons}(\_, z).\,\hat{1} \mathbin{\hat{+}} \mathit{length}\ z)$$
The second pattern matches a nonempty list and binds $z$ to its tail at the same time, obviating the need for an explicit application of $\pi_2$.
1.9.16 EXERCISE [$\star\star$, Recommended, $\nrightarrow$]: Under the above definition of $\mathit{length}$, consider an application of $\mathit{length}$ to the list $\mathit{Cons}(\hat{0}, \mathit{Nil})$. After eliminating the syntactic sugar, determine by which reduction sequence this expression reduces to a value.
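To experiment with Exercise 1.9.16, it may help to transcribe $\mathit{length}$ into Python first; the encoding of $\mathit{Nil}$ as `("Nil",)` and $\mathit{Cons}(h, t)$ as `("Cons", h, t)` is our assumption, and plain integers stand in for $\hat{0}$, $\hat{1}$, and $\hat{+}$:

```python
def length(v):
    """Transcription of Example 1.9.15; list values are tagged tuples."""
    if v[0] == "Nil":            # clause Nil . 0
        return 0
    _tag, _head, tail = v        # clause Cons(_, z) . 1 + length z
    return 1 + length(tail)

print(length(("Nil",)))                            # 0
print(length(("Cons", 0, ("Nil",))))               # 1
print(length(("Cons", 1, ("Cons", 0, ("Nil",)))))  # 2
```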
Before we can proceed and extend the type system to deal with the new match construct, we must make two mild extensions to the syntax and meaning of constraints. First, if $\sigma$ is $\forall \bar{X}[C].T$, where $\bar{X} \mathbin{\#} \mathit{ftv}(T')$, then $T' \preceq \sigma$ stands for the constraint $\exists \bar{X}.(C \wedge T' \leq T)$. This relation is identical to the instance relation (Definition 1.3.3), except the direction of subtyping is reversed. We extend the syntax of constraints with instantiation constraints of the form $T \preceq x$ and define their meaning by adding a symmetric counterpart of CM-INSTANCE. We remark that, when subtyping is interpreted as equality, the relations $\sigma \preceq T$ and $T \preceq \sigma$ coincide, so this extension is unnecessary in that particular case. Second, we extend the syntax of environments so that several successive bindings may share a set of quantifiers and a constraint. That is, we allow writing $\forall \bar{X}[C].(x_1 : T_1; \ldots; x_k : T_k)$ for $x_1 : \forall \bar{X}[C].T_1; \ldots; x_k : \forall \bar{X}[C].T_k$. From a theoretical standpoint, this is little more than syntactic sugar; however, in practice, it is useful to implement this new idiom literally, since it avoids unnecessary copying of the constraint $C$.
Let us now extend the type system. For the sake of brevity, we extend the constraint generation rules only. Of course, it would also be possible to define corresponding extensions of the rule-based type systems shown earlier, namely DM, HM($X$), and PCB($X$). We begin by defining a constraint $\llbracket T : p \rrbracket$ that represents a necessary and sufficient condition for values of type $T$ to be acceptable inputs for the pattern $p$. Its free type variables are a subset of $\mathit{ftv}(T)$, while its free program identifiers are either constructors or program variables bound by $p$. It is defined in the upper part of Figure 1-15:

$$\begin{aligned}
\llbracket T : \_ \rrbracket &= \mathit{true} \\
\llbracket T : z \rrbracket &= T \preceq z \\
\llbracket T : c\,p_1 \cdots p_k \rrbracket &= \exists \bar{X}.\bigl(\vec{X} \rightarrow T \preceq c \wedge \textstyle\bigwedge_{i=1}^{k} \llbracket X_i : p_i \rrbracket\bigr) \\
\llbracket T : p_1 \wedge p_2 \rrbracket &= \llbracket T : p_1 \rrbracket \wedge \llbracket T : p_2 \rrbracket \\
\llbracket T : p_1 \vee p_2 \rrbracket &= \llbracket T : p_1 \rrbracket \wedge \llbracket T : p_2 \rrbracket \\
\llbracket \mathtt{match}\ t\ \mathtt{with}\ (p_i.t_i)_{i=1}^{k} : T \rrbracket &= \textstyle\bigwedge_{i=1}^{k} \mathtt{let}\ \forall X\bar{X}_i[\llbracket t : X \rrbracket \wedge \mathtt{let}\ \vec{z}_i : \vec{X}_i\ \mathtt{in}\ \llbracket X : p_i \rrbracket].(\vec{z}_i : \vec{X}_i)\ \mathtt{in}\ \llbracket t_i : T \rrbracket \\
&\qquad \text{where } \vec{z}_i = \mathit{dpi}(p_i)
\end{aligned}$$

Figure 1-15: Constraint generation for patterns and pattern matching

The first rule states that a wildcard matches values of arbitrary type. The second and third rules govern program variables and constructor applications in patterns. They are identical to the rules that govern these constructs in expressions (page 59), except that the direction of subtyping is reversed. In the absence of subtyping, they would be entirely identical. We write $\vec{X}$ for $X_1 \ldots X_k$ and $\vec{X} \rightarrow T$ for $X_1 \rightarrow \ldots \rightarrow X_k \rightarrow T$. As usual, the type variables $X_1, \ldots, X_k$ must have kind $\star$ and must be distinct and fresh for the equation's left-hand side. The last two rules simply distribute the type $T$ to both subpatterns. It is easy to check that $\llbracket T : p \rrbracket$ is contravariant in $T$:
1.9.17 LEMMA: $T' \leq T \wedge \llbracket T : p \rrbracket$ entails $\llbracket T' : p \rrbracket$.
This property reflects the fact that $T$ represents the type of an input for the pattern $p$. Compare it with Lemma 1.6.3.
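The generation rules of Figure 1-15 can likewise be prototyped. The Python sketch below renders constraints as plain strings, writing `<:` for $\preceq$; the fresh-variable scheme and this concrete syntax are our own choices, and for brevity it treats `Cons` as a binary constructor rather than a unary one applied to a pair:

```python
import itertools

_fresh = itertools.count(1)            # supply of fresh type variables

def gen(T, p):
    """[[T : p]] from Figure 1-15, rendered as a string ('<:' stands for
    the instantiation constraint)."""
    kind = p[0]
    if kind == "wild":                 # [[T : _]] = true
        return "true"
    if kind == "var":                  # [[T : z]] = T <: z
        return f"{T} <: {p[1]}"
    if kind == "ctor":                 # [[T : c p1 ... pk]]
        subs = p[2:]
        xs = [f"X{next(_fresh)}" for _ in subs]
        arrow = " -> ".join(xs + [T])
        parts = [f"{arrow} <: {p[1]}"] + [gen(x, q) for x, q in zip(xs, subs)]
        return "exists " + " ".join(xs) + ". (" + " and ".join(parts) + ")"
    # p1 /\ p2 and p1 \/ p2 both simply distribute T to the subpatterns
    return gen(T, p[1]) + " and " + gen(T, p[2])

# A variant of the pattern Cons(_, z) of Example 1.9.18:
print(gen("T", ("ctor", "Cons", ("wild",), ("var", "z"))))
# exists X1 X2. (X1 -> X2 -> T <: Cons and true and X2 <: z)
```

Up to the choice of fresh names and the arity of `Cons`, the generated constraint has the same shape as the one derived in Example 1.9.18.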
1.9.18 EXAMPLE: Consider the pattern $\mathit{Cons}(\_, z)$, which appears in Example 1.9.15. We have

$$\begin{aligned}
& \llbracket T : \mathit{Cons}(\_, z) \rrbracket \\
\equiv\ & \exists Z_1.\bigl(\llbracket Z_1 \rightarrow T : \mathit{Cons} \rrbracket \wedge \llbracket Z_1 : (\_, z) \rrbracket\bigr) \\
\equiv\ & \exists Z_1.\bigl(Z_1 \rightarrow T \preceq \mathit{Cons} \wedge \exists Z_2 Z_3.\bigl(\llbracket Z_2 \rightarrow Z_3 \rightarrow Z_1 : (\cdot,\cdot) \rrbracket \wedge \llbracket Z_2 : \_ \rrbracket \wedge \llbracket Z_3 : z \rrbracket\bigr)\bigr) \\
\equiv\ & \exists Z_1 Z_2 Z_3.\bigl(Z_1 \rightarrow T \preceq \mathit{Cons} \wedge Z_2 \rightarrow Z_3 \rightarrow Z_1 \preceq (\cdot,\cdot) \wedge Z_3 \preceq z\bigr)
\end{aligned}$$
where $Z_1, Z_2, Z_3$ are fresh for $T$. Let us now place this constraint within the scope of the initial environment, which assigns type schemes to the constructors $\mathit{Cons}$ and $(\cdot,\cdot)$, and within the scope of a binding of $z$ to some type $T'$. We find

$$\begin{aligned}
& \mathtt{let}\ \Gamma_0\ \mathtt{in\ let}\ z : T'\ \mathtt{in}\ \llbracket T : \mathit{Cons}(\_, z) \rrbracket \\
\equiv\ & \exists Z_1 Z_2 Z_3.\bigl(\exists X.(Z_1 \rightarrow T \leq X \times \mathit{list}\ X \rightarrow \mathit{list}\ X) \wedge {} \\
& \qquad\ \exists Y_1 Y_2.(Z_2 \rightarrow Z_3 \rightarrow Z_1 \leq Y_1 \rightarrow Y_2 \rightarrow Y_1 \times Y_2) \wedge Z_3 \leq T'\bigr) \\
\equiv\ & \exists X.\bigl(T \leq \mathit{list}\ X \wedge \mathit{list}\ X \leq T'\bigr)
\end{aligned}$$

where the final simplification relies mainly on C-ARROW, on the corresponding rule for products, and on C-EXTRANS, and is left as an exercise to the reader. Thus, the constraint states that the pattern matches values that have type $\mathit{list}\ X$ (equivalently, values whose type $T$ is a subtype of $\mathit{list}\ X$), for some undetermined element type $X$, and binds $z$ to values of type $\mathit{list}\ X$ (equivalently, values whose type $T'$ is a supertype of $\mathit{list}\ X$).
The above example seems to indicate that the constraint generation rules for patterns make some sense. Still, the careful reader may be somewhat puzzled by the third rule, which, compared to its analogue for expressions, reverses the direction of subtyping, but does not reverse the direction of instantiation. Indeed, in order for this rule to make sense, and to be sound, we must formulate a requirement concerning the type schemes assigned to constructors.
1.9.19 DEFINITION: A constructor $c$ is invertible if and only if, when $\vec{X}$ and $\vec{X}'$ have length $a(c)$, the constraint let $\Gamma_0$ in $(\vec{X}' \rightarrow T \preceq c \wedge c \preceq \vec{X} \rightarrow T)$ entails $\vec{X} \leq \vec{X}'$. In the following, we assume that patterns contain invertible constructors only.
Intuitively, when $c$ is invertible, it is possible to recover the type of every $v_i$ from the type of $c\,v_1 \ldots v_k$, a crucial property for pattern matching to be possible. Please note that, if $\Gamma_0(c)$ is monomorphic, then $c$ is invertible. The following lemma identifies another important class of invertible constructors.
1.9.20 LEMMA: The constructors of algebraic data types are invertible.
PROOF: Let $c$ be a constructor introduced by the definition of an algebraic data type $D$. Let $k = a(c)$. Then, the type scheme $\Gamma_0(c)$ is of the form $\forall \bar{Y}.\,\vec{T} \rightarrow D\,\vec{Y}$, where $\vec{Y}$ are the parameters of the definition and $\vec{T}$, a vector of length $k$, consists of some of the definition's components. (More precisely, $\vec{T}$ contains just one component in the case of variant types and contains all components in the case of record types.) Let $\vec{X}$ and $\vec{X}'$ have length $k$. Let $\forall \bar{Y}_1.\,\vec{T}_1 \rightarrow D\,\vec{Y}_1$ and $\forall \bar{Y}_2.\,\vec{T}_2 \rightarrow D\,\vec{Y}_2$ be two $\alpha$-equivalent forms of the type scheme $\Gamma_0(c)$, with $\bar{Y}_1 \mathbin{\#} \bar{Y}_2$ and $\bar{Y}_1\bar{Y}_2 \mathbin{\#} \mathit{ftv}(\vec{X}, \vec{X}', T)$.
The constraint let Γ 0 Γ 0 Gamma_(0)\Gamma_{0}Γ0 in ( X T c c X T c c ( vec(X)^(')rarrT-<=c^^c-<=:}\left(\overrightarrow{\mathrm{X}}^{\prime} \rightarrow \mathrm{T} \preceq \mathrm{c} \wedge \mathrm{c} \preceq\right.(XTcc X T X T vec(X)rarrT\overrightarrow{\mathrm{X}} \rightarrow \mathrm{T}XT ) is, by definition, equivalent to X T Γ 0 ( c ) Γ 0 ( c ) X T X T Γ 0 ( c ) Γ 0 ( c ) X T vec(X)^(')rarrT-<Gamma_(0)(c)^^Gamma_(0)(c)-< vec(X)rarrT\overrightarrow{\mathrm{X}}^{\prime} \rightarrow \mathrm{T} \prec \Gamma_{0}(\mathrm{c}) \wedge \Gamma_{0}(\mathrm{c}) \prec \overrightarrow{\mathrm{X}} \rightarrow \mathrm{T}XTΓ0(c)Γ0(c)XT, that is, Y ¯ 1 ( X T T 1 D Y 1 ) Y ¯ 2 ( T 2 D Y 2 X T ) Y ¯ 1 X T T 1 D Y 1 Y ¯ 2 T 2 D Y 2 X T EE bar(Y)_(1)*( vec(X)^(')rarrT <= vec(T)_(1)rarrD vec(Y)_(1))^^EE bar(Y)_(2)*( vec(T)_(2)rarrD vec(Y)_(2) <= vec(X)rarrT)\exists \overline{\mathrm{Y}}_{1} \cdot\left(\overrightarrow{\mathrm{X}}^{\prime} \rightarrow \mathrm{T} \leq \overrightarrow{\mathrm{T}}_{1} \rightarrow \mathrm{D} \overrightarrow{\mathrm{Y}}_{1}\right) \wedge \exists \overline{\mathrm{Y}}_{2} \cdot\left(\overrightarrow{\mathrm{T}}_{2} \rightarrow \mathrm{D} \overrightarrow{\mathrm{Y}}_{2} \leq \overrightarrow{\mathrm{X}} \rightarrow \mathrm{T}\right)Y¯1(XTT1DY1)Y¯2(T2DY2XT). By C-ExAND and CARRow, this may be written Y ¯ 1 Y ¯ 2 ( D Y 2 T D Y 1 X T 2 T 1 X ) Y ¯ 1 Y ¯ 2 D Y 2 T D Y 1 X T 2 T 1 X EE bar(Y)_(1) bar(Y)_(2)*(D vec(Y)_(2) <= T <= D vec(Y)_(1)^^( vec(X)) <= vec(T)_(2)^^ vec(T)_(1) <= vec(X)^('))\exists \bar{Y}_{1} \bar{Y}_{2} \cdot\left(D \vec{Y}_{2} \leq T \leq D \vec{Y}_{1} \wedge \vec{X} \leq \vec{T}_{2} \wedge \vec{T}_{1} \leq \vec{X}^{\prime}\right)Y¯1Y¯2(DY2TDY1XT2T1X). Now,
by Definition 1.9.8, D Y 2 D Y 1 Y 2 D Y 1 vec(Y)_(2) <= D vec(Y)_(1)\overrightarrow{\mathrm{Y}}_{2} \leq \mathrm{D} \overrightarrow{\mathrm{Y}}_{1}Y2DY1 entails T 2 T 1 T 2 T 1 vec(T)_(2) <= vec(T)_(1)\overrightarrow{\mathrm{T}}_{2} \leq \overrightarrow{\mathrm{T}}_{1}T2T1, so the previous constraint entails Y ¯ 1 Y ¯ 2 ( X X ) Y ¯ 1 Y ¯ 2 X X EE bar(Y)_(1) bar(Y)_(2)*( vec(X) <= vec(X)^('))\exists \bar{Y}_{1} \overline{\mathrm{Y}}_{2} \cdot\left(\overrightarrow{\mathrm{X}} \leq \overrightarrow{\mathrm{X}}^{\prime}\right)Y¯1Y¯2(XX), that is, X X X X vec(X) <= vec(X)^(')\overrightarrow{\mathrm{X}} \leq \overrightarrow{\mathrm{X}}^{\prime}XX.
An important class of noninvertible constructors consists of those associated with existential type definitions (page 118), where not all quantifiers of the type scheme Γ₀(c) are parameters of the type constructor D. For instance, under the definition D ≈ ℓ: ∃X.X, the type scheme associated with ℓ is ∀X.X → D. Then, it is easy to check that ℓ is not invertible. This reflects the fact that it is not possible to recover the type of v from the type of ℓ v (which must be D in any case) and explains why existential types require special treatment.
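Such a noninvertible constructor can be sketched in OCaml with a GADT declaration (a hypothetical illustration, not part of the chapter's formal system): the argument's type is hidden by the constructor and cannot be recovered from the result type.

```ocaml
(* A GADT constructor whose argument type is hidden, akin to ℓ under
   D ≈ ℓ: ∃X.X. The constructor L is not invertible: the component's
   type cannot be recovered from the type d. *)
type d = L : 'a -> d

(* Values of different types all receive the same type d. *)
let values : d list = [L 1; L "one"; L (fun x -> x)]
```

Pattern matching on `L v` gives `v` an abstract type about which nothing is known, which is the special treatment alluded to above.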
We are now ready to associate a constraint generation rule with the match construct. It is given in the lower part of Figure 1-15. In the rule's right-hand side, we write z⃗ᵢ for the program variables bound by the pattern pᵢ, and we write X⃗ᵢ for a vector of type variables of the same length. The type variables X X̄ᵢ must have kind ⋆, must be pairwise distinct, and must not appear free in the rule's left-hand side. Let us now explain the rule. Its right-hand side is a conjunction, where each conjunct deals with one clause of the match construct, requiring tᵢ to have type T under certain assumptions about the program variables z⃗ᵢ bound by the pattern pᵢ. There remains to explain how these assumptions are built. First, as in the case of a let construct, we summon a fresh type variable X and produce ⟦t : X⟧, the least specific constraint that guarantees that t has type X. Then, reflecting the operational semantics, which feeds (the value produced by) t into the pattern pᵢ, we feed the type X into pᵢ and produce let z⃗ᵢ : X⃗ᵢ in ⟦X : pᵢ⟧, a constraint that guarantees that X⃗ᵢ is a correct vector of type assumptions for the program variables z⃗ᵢ (see Example 1.9.18).
This explains why we may place ⟦tᵢ : T⟧ within the scope of (z⃗ᵢ : X⃗ᵢ). There remains to point out that, as in the case of the let construct, every assignment of ground types to X X̄ᵢ that satisfies the constraint ⟦t : X⟧ ∧ let z⃗ᵢ : X⃗ᵢ in ⟦X : pᵢ⟧ is acceptable, so it is valid to universally quantify these type variables. This allows the program variables z⃗ᵢ to receive polymorphic type schemes when t itself has polymorphic type.
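As a small concrete illustration in OCaml, the body of a clause is typechecked under type assumptions for the pattern's variables, derived from the scrutinee's type (the function name below is invented):

```ocaml
(* The clause body is checked under assumptions for the variables bound
   by the pattern: here x and y receive the component types of the
   pair p, so the result type is the pair type with components swapped. *)
let swap p =
  match p with
  | (x, y) -> (y, x)
```

The inferred type is `'a * 'b -> 'b * 'a`: the assumptions for `x` and `y` are exactly the components of the scrutinee's type.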
1.9.21 Exercise [ ***\star, Recommended]: We have previously suggested viewing let z = t 1 z = t 1 z=t_(1)\mathrm{z}=\mathrm{t}_{1}z=t1 in t 2 t 2 t_(2)\mathrm{t}_{2}t2 as syntactic sugar for match t 1 t 1 t_(1)\mathrm{t}_{1}t1 with z . t 2 z . t 2 z.t_(2)\mathrm{z} . \mathrm{t}_{2}z.t2, and shown that the operational semantics validates this view. Check that it is also valid from a typing perspective.
The match constraint generation rule, if implemented literally, makes k copies of the constraint ⟦t : X⟧. When k is greater than 1, this compromises the linear time and space complexity of constraint generation. To remedy this problem, one may modify the rule as follows: replace every copy of ⟦t : X⟧ with z ≼ X and place the constraint within the context let z : ∀X[⟦t : X⟧].X in [], where z is
a fresh program variable. It is not difficult to check that the logical meaning of the constraint is not affected and that a linear behavior is recovered. In practice, solving the new constraint requires taking instances of the type scheme ∀X[⟦t : X⟧].X, which essentially requires copying ⟦t : X⟧ again; however, an efficient solver may now simplify this subconstraint before duplicating it.
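To make the size argument concrete, here is a hypothetical miniature constraint language in OCaml (all constructor and variable names are invented for illustration): the literal rule duplicates the sub-constraint, while the modified rule binds it once under a let form and takes k instances of it.

```ocaml
(* A toy constraint syntax: CInstance "z" plays the role of z ≼ X, and
   CLet binds a constraint once so that instances can share it. *)
type constr =
  | CTrue
  | CAnd of constr * constr
  | CInstance of string                (* z ≼ X *)
  | CLet of string * constr * constr   (* let z : ∀X[C].X in C' *)

let rec size = function
  | CTrue | CInstance _ -> 1
  | CAnd (c1, c2) | CLet (_, c1, c2) -> 1 + size c1 + size c2

(* Literal rule: one copy of c per clause. *)
let duplicated c k =
  let rec go k = if k <= 1 then c else CAnd (c, go (k - 1)) in
  go k

(* Modified rule: a single copy of c, shared among k instances. *)
let shared c k =
  let rec go k =
    if k <= 1 then CInstance "z" else CAnd (CInstance "z", go (k - 1))
  in
  CLet ("z", c, go k)
```

Here `size (shared c k)` grows as `size c + O(k)`, whereas `size (duplicated c k)` grows as `k × size c`, which is the loss of linearity described above.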
The following lemma is a key to establishing subject reduction for R-MATCH. It relies on the requirement that constructors be invertible.
1.9.22 Lemma: Assume [p ↦ v] is defined and maps z⃗ to w⃗, where z̄ = dpv(p). Let z⃗ : T⃗ be an arbitrary monomorphic environment of domain z̄. Then, let Γ₀ in (⟦v : T⟧ ∧ let z⃗ : T⃗ in ⟦T : p⟧) entails let Γ₀ in ⟦w⃗ : T⃗⟧.
We now prove that our extension of ML-the-calculus with pattern matching enjoys subject reduction. We only state that R-MATCH preserves types, and leave the new subcase of R-CONTEXT, where the evaluation context involves a match construct, to the reader. For this subcase to succeed, the value restriction (Definition 1.7.7) must be extended to require that either all constants have pure semantics or all match constructs are in fact of the form match v with ( p i t i ) i = 1 k p i t i i = 1 k (p_(i)*t_(i))_(i=1)^(k)\left(\mathrm{p}_{i} \cdot \mathrm{t}_{i}\right)_{i=1}^{k}(piti)i=1k.

1.9.23 Theorem [Subject Reduction]: (R-MATCH) ⊆ (⊑)

1.9.24 Exercise [⋆⋆⋆, ↛]: For the sake of simplicity, we have omitted the production ref p from the syntax of patterns. The pattern ref p matches every memory location whose content (with respect to the current store) is matched by p. Determine how the previous definitions and proofs must be extended in order to accommodate this new production.
The progress property does not hold in general: for instance, match Nil with (Cons z.z) is well-typed (with type ∀X.X) but is stuck. In actual implementations of ML-the-programming-language, such errors are dynamically detected. This may be considered a weakness of ML-the-type-system. Fortunately, however, it is often possible to statically prove that a particular match construct is exhaustive and cannot go wrong. Indeed, if match v with (pᵢ.tᵢ)_{i=1..k} is well-typed, then for every i ∈ {1, …, k}, the constraint let Γ₀ in (⟦v : X⟧ ∧ ∃X̄. let z⃗ᵢ : X⃗ in ⟦X : pᵢ⟧), where z̄ᵢ are the program variables bound by pᵢ, must be satisfiable; that is, v must have some type that is an acceptable input for pᵢ. This fact yields information about v, from which it may be possible to derive that v must match one of the patterns pᵢ.
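OCaml exhibits exactly this behavior: a non-exhaustive match is accepted (with a compile-time warning), and applying it to an uncovered value is detected dynamically by raising the `Match_failure` exception (the function name below is invented).

```ocaml
(* A non-exhaustive match: OCaml warns that [] is not matched, and the
   missed case is detected at run time by raising Match_failure. *)
let head l = match l with x :: _ -> x
```

Evaluating `head []` raises `Match_failure`, the dynamic counterpart of the stuck term above.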
1.9.25 Example: Let k = 2, p₁ = Nil, and p₂ = Cons (z₁, z₂). Then, the constraints let Γ₀ in ∃X̄. let z⃗ᵢ : X⃗ in ⟦X : pᵢ⟧, for i ∈ {1, 2}, are both equivalent (after simplification, when i = 2) to ∃Z.X ≤ list Z. Because the type constructor list is isolated, every closed value v whose type X satisfies this constraint
must be an application of Nil or Cons. In the latter case, because Cons has type ∀X.X × list X → list X, and because the type constructor × is isolated, the argument to Cons must be a pair. We conclude that v must match either p₁ or p₂, which guarantees that this match construct is exhaustive and its evaluation cannot go wrong.
It is beyond the scope of this chapter to give more details about the check for exhaustiveness. The reader is referred to (Sekar, Ramesh, and Ramakrishnan, 1995; Le Fessant and Maranget, 2001).

Type annotations

So far, we have been interested in a very pure, and extreme, form of type inference. Indeed, in ML-the-calculus, expressions contain no explicit type information whatsoever: it is entirely inferred. In practice, however, it is often useful to insert type annotations within expressions, because they provide a form of machine-checked documentation. Type annotations are also helpful when attempting to trace the cause of a type error: by supplying the typechecker with (supposedly) correct type information, one runs a better chance of finding a type inconsistency near an actual programming mistake.
When type annotations are allowed to contain type variables, one must be quite careful about where (at which program point) and how (existentially or universally) these variables are bound. Indeed, the meaning of type annotations cannot be made precise without settling these issues. In what follows, we first explain how to introduce type annotations whose type variables are bound locally and existentially. We show that extending ML-the-calculus with such limited type annotations is again a simple matter of introducing new constants. Then, we turn to a more general case, where type variables may be explicitly existentially introduced at any program point. We defer the discussion of universally bound type variables to Section 1.10.
Let a local existential type annotation X ¯ X ¯ EE bar(X)\exists \overline{\mathrm{X}}X¯. T T T\mathrm{T}T be a pair of a set of type variables X ¯ X ¯ bar(X)\overline{\mathrm{X}}X¯ and a type T T T\mathrm{T}T, where T T T\mathrm{T}T has kind , X ¯ , X ¯ ***, bar(X)\star, \overline{\mathrm{X}},X¯ is considered bound within T T T\mathrm{T}T, and X ¯ X ¯ bar(X)\overline{\mathrm{X}}X¯ contains f t v ( T ) f t v ( T ) ftv(T)f t v(\mathrm{~T})ftv( T). For every such annotation, we introduce a new unary destructor ( : X ¯ . T ) ( : X ¯ . T ) (*:EE bar(X).T)(\cdot: \exists \overline{\mathrm{X}} . \mathrm{T})(:X¯.T). Such a definition is valid only because a type annotation must be closed, that is, does not have any free type variables. We write ( t : X ¯ . T ) ( t : X ¯ . T ) (t:EE bar(X).T)(t: \exists \bar{X} . T)(t:X¯.T) for the application ( ( : X ¯ . T ) ) t ( ( : X ¯ . T ) ) t ((*:EE bar(X).T))t((\cdot: \exists \bar{X} . T)) t((:X¯.T))t. Since a type annotation does not affect the meaning of a program, the new destructor has identity semantics:
( v : X ¯ . T ) δ v ( v : X ¯ . T ) δ v (v:EE bar(X).T)rarr"delta"v(\mathrm{v}: \exists \overline{\mathrm{X}} . \mathrm{T}) \xrightarrow{\delta} \mathrm{v}(v:X¯.T)δv
(R-AnNotation)
Its type scheme, however, is not that of the identity, namely ∀X.X → X: instead, it is less general, so that annotating an expression restricts its type. Indeed,
we extend the initial environment Γ 0 Γ 0 Gamma_(0)\Gamma_{0}Γ0 with the binding
( : X ¯ . T ) : X ¯ . T T ( : X ¯ . T ) : X ¯ . T T (*:EE bar(X).T):AA bar(X).TrarrT(\cdot: \exists \overline{\mathrm{X}} . \mathrm{T}): \forall \overline{\mathrm{X}} . \mathrm{T} \rightarrow \mathrm{T}(:X¯.T):X¯.TT
1.9.26 Exercise [⋆]: Check that ∀X̄.T → T is an instance of ∀X.X → X in Damas and Milner's sense, that is, the former is obtained from the latter via the rule DM-Inst' given in Exercise 1.2.23. Does this allow arguing that the type scheme assigned to (· : ∃X̄.T) is sound? Check that the above definitions meet the requirements of Definition 1.7.6.
Although inserting a type annotation does not change the semantics of the program, it does affect constraint generation, hence type inference. We let the reader check that, assuming X ¯ # f t v ( t , T ) X ¯ # f t v t , T bar(X)#ftv(t,T^('))\overline{\mathrm{X}} \# f t v\left(\mathrm{t}, \mathrm{T}^{\prime}\right)X¯#ftv(t,T), the following derived constraint generation rule holds:
let Γ₀ in ⟦(t : ∃X̄.T) : T′⟧ ≡ let Γ₀ in ∃X̄.(⟦t : T⟧ ∧ T ≤ T′)
So far, expressions cannot have free type variables, so the hypothesis X̄ # ftv(t) may seem superfluous. However, we shall soon allow expressions to contain type annotations with free type variables, so we prefer to make this condition explicit now. According to this rule, the effect of the type annotation is to force the expression t to have type T, for some choice of the type variables X̄. As usual in type systems with subtyping, the expression's final type T′ may then be an arbitrary supertype of this particular instance of T. When subtyping is interpreted as equality, T′ and T are equated by the constraint, so this constraint generation rule may be read: a valid type for (t : ∃X̄.T) must be of the form T, for some choice of the type variables X̄.
1.9.27 Example: In DM extended with integers, the expression (λz.z : int → int) has most general type int → int, even though the underlying identity function has most general type ∀X.X → X, so the annotation restricts its type. The expression (λz.z +̂ 1̂ : ∃X.X → X) has type int → int, which is also the most general type of the underlying function, so the annotation acts merely as documentation in this case. Note that the type variable X is instantiated to int by the constraint solver. The expression (λz.(z 1̂) : ∃X.X → int) has type (int → int) → int because the underlying function has type (int → Y) → Y, which successfully unifies with X → int by instantiating X to int → int and Y to int. Last, the expression (λz.(z 1̂) : ∃X.int → X) is ill-typed, even though the underlying expression is well-typed, because the equation (int → Y) → Y = int → X is unsatisfiable.
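These examples can be replayed almost verbatim in OCaml, whose expression-level annotations behave like the existential annotations described here: a type variable such as `'x` in the annotation may be instantiated by the solver (the binding names below are invented).

```ocaml
(* OCaml analogues of Example 1.9.27. An annotation restricts the
   inferred type; its type variables may be instantiated by the solver. *)
let id_int = (fun z -> z : int -> int)    (* restricted from 'a -> 'a  *)
let incr   = (fun z -> z + 1 : 'x -> 'x)  (* 'x is instantiated to int *)
let app1   = (fun z -> z 1 : 'x -> int)   (* 'x becomes int -> int     *)
(* By contrast, (fun z -> z 1 : int -> 'x) is rejected, because
   (int -> 'y) -> 'y does not unify with int -> 'x. *)
```

Note that, as in the text, `('x -> 'x)` does not require polymorphism: it merely constrains the shape of the type, and `'x` ends up equal to `int` in `incr`.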
1.9.28 Example: In DM extended with pairs, the expression λz₁.λz₂.((z₁ : ∃X.X), (z₂ : ∃X.X)) has most general type ∀X Y.X → Y → X × Y. In other words, the two occurrences of X do not represent the same type. Indeed, one could just as well have written λz₁.λz₂.((z₁ : ∃X.X), (z₂ : ∃Y.Y)). If one wishes z₁ and z₂ to receive the same type, one must lift the type annotations and merge them above the pair constructor, as follows: λz₁.λz₂.((z₁, z₂) : ∃X.X × X). In the process, the type constructor × has appeared in the annotation, causing its size to increase.
The above example reveals a limitation of this style of type annotations: by requiring every type annotation to be closed, we lose the ability for two separate annotations to share a type variable. Yet, such a feature is sometimes desirable. If the two annotations where sharing is desired are distant in the code, it may be awkward to lift and merge them into a single annotation; so, more expressive power is sometimes truly needed.
Thus, we are led to consider more general type annotations, of the form (t : T), where T has kind ⋆, and where the type variables that appear within T are considered free, so that distinct type annotations may refer to shared type variables. For this idea to make sense, however, it is still necessary to specify where these type variables are bound. We do so using expressions of the form ∃X̄.t. Such an expression binds the type variables X̄ within the expression t, so that all free occurrences of X (where X ∈ X̄) in type annotations inside t stand for the same type. Thus, we break the simple type annotation construct (· : ∃X̄.T) into two more elementary constituents, namely existential type variable introduction ∃X̄.· and type constraint (· : T). Note that both are new forms of expressions; neither can be encoded by adding new constants to the calculus, because it is not possible to assign closed type schemes to them.
Technically, allowing expressions to contain type variables requires some care. Several constraint generation rules employ auxiliary type variables, which become bound in the generated constraint. These type variables may be chosen in an arbitrary way, provided they do not appear free in the rule's left-hand side - a side-condition intended to avoid inadvertent capture. So far, this side-condition could be read: the auxiliary type variables used to form the constraint [ [ t : T ] ] [ [ t : T ] ] [[t:T]]\llbracket t: T \rrbracket[[t:T]] must not appear free within T T T\mathrm{T}T. Now, since type annotations may contain free type variables, the side-condition becomes: the auxiliary type variables used to form [ [ t : T ] ] [ [ t : T ] ] [[t:T]]\llbracket \mathrm{t}: \mathrm{T} \rrbracket[[t:T]] must not appear free within t t t\mathrm{t}t or T T T\mathrm{T}T.
With this extended side-condition in mind, our original constraint generation rules remain unchanged. We add two new rules to describe how the new expression forms affect constraint generation:
⟦∃X̄.t : T⟧ = ∃X̄.⟦t : T⟧    provided X̄ # ftv(T)
⟦(t : T) : T′⟧ = ⟦t : T⟧ ∧ T ≤ T′
The effect of these rules is simple. The construct x ¯ x ¯ EE bar(x)\exists \bar{x}x¯.t is an indication to the constraint generator that the type variables X ¯ X ¯ bar(X)\overline{\mathrm{X}}X¯, which may occur free within type annotations inside t t ttt, should be existentially bound at this point. The side-condition X ¯ X ¯ bar(X)\overline{\mathrm{X}}X¯ # f t v ( T ) f t v ( T ) ftv(T)f t v(\mathrm{~T})ftv( T) ensures that quantifying over X ¯ X ¯ bar(X)\overline{\mathrm{X}}X¯ in the generated constraint does not capture type variables in the expected type T. It can always be satisfied by α α alpha\alphaα-conversion of the expression X ¯ X ¯ EE bar(X)\exists \overline{\mathrm{X}}X¯.t. The construct ( t : T t : T t:T\mathrm{t}: \mathrm{T}t:T ) is an indication to the constraint generator that the expression t t ttt should have type T T T\mathrm{T}T, and it is treated as such by generating the subconstraint [ [ t : T ] ] [ [ t : T ] ] [[t:T]]\llbracket t: T \rrbracket[[t:T]]. The expression's type may be an arbitrary supertype of T T T\mathrm{T}T, hence the auxiliary constraint T T T T T <= T^(')\mathrm{T} \leq \mathrm{T}^{\prime}TT.
1.9.29 Example: In DM extended with pairs, the expression λz₁.λz₂.∃X.((z₁ : X), (z₂ : X)) has most general type ∀X.X → X → X × X. Indeed, the constraint generated for this expression contains the pattern ∃X.(⟦z₁ : X⟧ ∧ ⟦z₂ : X⟧ ∧ …), which causes z₁ and z₂ to receive the same type. Note that this style is more flexible than that employed in Example 1.9.28, where we were forced to use a single, monolithic type annotation to express this sharing constraint.
1.9.30 Remark: In practice, a type variable is usually represented as a memory cell in the typechecker's heap. So, one cannot say that the source code contains type variables; rather, it contains names that are meant to stand for type variables. Let us write X for such a name, and T for a type made of type constructors and names, rather than of type constructors and type variables. Then, our new expression forms are really ∃X̄.t and (t : T). When the constraint generator enters the scope of an introduction form ∃X̄.t, it allocates a vector of fresh type variables and augments an internal environment with bindings that map the names X̄ to these fresh type variables. Because the type variables are fresh, the side-condition of the first constraint generation rule above is automatically satisfied. When the constraint generator finds a type annotation (t : T), it looks up the internal environment to translate the external type T into an internal type (which fails if T contains a name that is not in scope) and applies the second constraint generation rule above.
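A possible sketch of this internal environment in OCaml, with invented names and a deliberately tiny type language: entering ∃X̄.t allocates fresh variables, and translating an annotation looks names up in the environment.

```ocaml
(* Internal types: variables are heap-allocated cells, modeled here by
   integer-stamped TVar nodes. *)
type ty = TVar of int | TArrow of ty * ty | TInt

(* External (source) types: made of names rather than type variables. *)
type src_ty = SName of string | SArrow of src_ty * src_ty | SInt

let fresh =
  let c = ref 0 in
  fun () -> incr c; TVar !c

(* Translate an annotation, failing on a name that is not in scope. *)
let rec translate env = function
  | SName n ->
      (match List.assoc_opt n env with
       | Some v -> v
       | None -> failwith ("type variable " ^ n ^ " is not in scope"))
  | SArrow (a, b) -> TArrow (translate env a, translate env b)
  | SInt -> TInt

(* Entering ∃X̄.t: allocate fresh variables for the names X̄. *)
let enter_exists names env =
  List.map (fun n -> (n, fresh ())) names @ env
```

Because `fresh` always returns a previously unused variable, the side-condition of the rule for ∃X̄.t is satisfied by construction, as the remark states.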
1.9.31 Exercise [⋆⋆, ↛]: Let X̄ ⊇ ftv(T) and X̄ # ftv(t). Check that the constraints ⟦(t : ∃X̄.T) : T′⟧ and ⟦∃X̄.(t : T) : T′⟧ are equivalent. In other words, the local type annotations introduced earlier may be expressed in terms of the more complex constructs described above.
1.9.32 Exercise [⋆⋆, ↛]: One way of giving identity semantics to our new type annotation constructs is to erase them altogether prior to execution. Give an inductive definition of ⌊t⌋, the expression obtained by removing all type annotation constructs from the expression t. Check that ⟦t : T⟧ entails ⟦⌊t⌋ : T⟧ and explain why this is sufficient to ensure type soundness.
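For concreteness, here is one possible definition of such an erasure function over a small hypothetical AST (the constructors below are invented; annotations carry their type as an opaque string): annotation constructs vanish, and the remaining constructs are erased structurally.

```ocaml
(* A tiny annotated calculus and its erasure ⌊t⌋. *)
type term =
  | Var of string
  | Lam of string * term
  | App of term * term
  | Annot of term * string          (* (t : T), with T kept abstract *)
  | Exists of string list * term    (* ∃X̄.t *)

let rec erase = function
  | Var _ as t -> t
  | Lam (z, t) -> Lam (z, erase t)
  | App (t1, t2) -> App (erase t1, erase t2)
  | Annot (t, _) -> erase t
  | Exists (_, t) -> erase t
```

For instance, erasing ∃X.(λz.(z : X) : X → X) yields λz.z.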
It is interesting to study how explicit introduction of existentially quantified type variables interacts with let-polymorphism. The source of their interaction lies in the difference between the constraints let z : x ¯ [ X . C 1 ] . T z : x ¯ X . C 1 . T z:AA bar(x)[EEX.C_(1)].T\mathrm{z}: \forall \overline{\mathrm{x}}\left[\exists \mathrm{X} . C_{1}\right] . \mathrm{T}z:x¯[X.C1].T in C 2 C 2 C_(2)C_{2}C2 and EE\exists x.let z : x ¯ [ C 1 ] . T z : x ¯ C 1 . T z:AA bar(x)[C_(1)].T\mathrm{z}: \forall \overline{\mathrm{x}}\left[C_{1}\right] . \mathrm{T}z:x¯[C1].T in C 2 C 2 C_(2)C_{2}C2, which was explained in Example 1.3.28. In the former constraint, every free occurrence of z z z\mathrm{z}z inside C 2 C 2 C_(2)C_{2}C2 causes a copy of x . C 1 x . C 1 EEx.C_(1)\exists \mathrm{x} . C_{1}x.C1 to be taken, thus creating its own fresh copy of X. In the latter constraint, on the other hand, every free occurrence of z z z\mathrm{z}z inside C 2 C 2 C_(2)C_{2}C2 produces a copy of C 1 C 1 C_(1)C_{1}C1. All such copies share references to X X X\mathrm{X}X, because its quantifier was not duplicated. In the former case, one may say that the type scheme assigned to z z zzz is polymorphic with respect to X X X\mathrm{X}X, while in the latter case it is monomorphic. As a result, the placement of type variable introduction expressions with respect to let bindings in the source code is meaningful: introducing a type variable outside of a let construct prevents it from being generalized.
1.9.33 EXAMPLE: In DM extended with integers and Booleans, the program let $f = \exists X.\lambda z.(z : X)$ in ($f$ 0, $f$ true) is well-typed. Indeed, the type scheme assigned to $f$ is $\forall X.X \rightarrow X$. However, the program $\exists X.$let $f = \lambda z.(z : X)$ in ($f$ 0, $f$ true) is ill-typed. Indeed, the type scheme assigned to $f$ is $X \rightarrow X$; then, no value of $X$ satisfies the constraints associated with the applications $f$ 0 and $f$ true. The latter behavior is observed in Objective Caml, where type variables are implicitly introduced at the outermost level of expressions.
More details about the treatment of type annotations in Standard ML, Objective Caml, and Haskell are given on page 113.
1.9.34 EXERCISE [$\star$, $\nrightarrow$]: Determine which constraints are generated for the two programs in Example 1.9.33. Check that the former is indeed well-typed, while the latter is ill-typed.

Recursive types

We have shown that specializing $\mathrm{HM}(X)$ with an equality-only syntactic model yields $\mathrm{HM}(=)$, a constraint-based formulation of Damas and Milner's type system. Similarly, it is possible to specialize $\mathrm{HM}(X)$ with an equality-only free regular tree model, yielding a constraint-based type system that may be viewed as an extension of Damas and Milner's type discipline with recursive types. This flavor of recursive types is sometimes known as equirecursive, since cyclic equations, such as $X = X \rightarrow X$, are then satisfiable. Our theorems about type inference and type soundness, which are independent of the model, remain valid. The constraint solver described in Section 1.8 may be used in the setting of an equality-only free regular tree model: the only difference with the syntactic case is that the occurs check is no longer performed.
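To make the role of the occurs check concrete, here is an illustrative first-order unifier (representation and names are ours) in which the check is a switch. With the check enabled, a cyclic equation such as $X = X \rightarrow X$ is rejected, as in the syntactic model; with it disabled, the equation is solved by building a cyclic term graph, as in the free regular tree model underlying equirecursive types.

```python
class Var:
    """A unification variable, resolved through a mutable link."""
    def __init__(self, name):
        self.name, self.link = name, None

def find(t):
    """Chase variable links to the representative of t."""
    while isinstance(t, Var) and t.link is not None:
        t = t.link
    return t

def occurs(x, t):
    """Does variable x occur in (finite) term t?"""
    t = find(t)
    if t is x:
        return True
    return isinstance(t, tuple) and any(occurs(x, u) for u in t[1:])

def unify(t1, t2, occurs_check=True):
    """Terms are Vars or tuples ('->', dom, cod), ('int',), etc."""
    t1, t2 = find(t1), find(t2)
    if t1 is t2:
        return True
    if isinstance(t1, Var):
        if occurs_check and occurs(t1, t2):
            return False  # cyclic equation rejected: syntactic model
        t1.link = t2      # without the check, this may create a cycle
        return True
    if isinstance(t2, Var):
        return unify(t2, t1, occurs_check)
    return (t1[0] == t2[0] and len(t1) == len(t2)
            and all(unify(a, b, occurs_check) for a, b in zip(t1[1:], t2[1:])))

x = Var("X")
print(unify(x, ("->", x, x)))                      # False: occurs check fires
x = Var("X")
print(unify(x, ("->", x, x), occurs_check=False))  # True: X = X -> X is satisfiable
```

This matches the claim above: the solver's code is otherwise unchanged, and only this one test distinguishes the two models.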
Please note that, although ground types are regular, types remain finite objects: their syntax is unchanged. The $\mu$ notation commonly employed to describe recursive types may be emulated using type equations: for instance, the notation $\mu X.X \rightarrow X$ corresponds, in our constraint-based approach, to the type scheme $\forall X[X = X \rightarrow X].X$.
Although recursive types come for free, as explained above, they have not been adopted in mainstream programming languages based on ML-the-type-system. The reason is pragmatic: experience shows that many nonsensical expressions are well-typed in the presence of recursive types, whereas they are not in their absence. Thus, the gain in expressiveness is offset by the fact that many programming mistakes are detected later than otherwise possible. Consider, for instance, the following Objective Caml session:
This nonsensical variant of map is essentially useless, yet well-typed. Its principal type scheme, in our notation, is $\forall XYZ[Y = \mathrm{list}\ Y \wedge Z = \mathrm{list}\ Z].X \rightarrow Y \rightarrow Z$. In the absence of recursive types, it is ill-typed, since the constraint $Y = \mathrm{list}\ Y \wedge Z = \mathrm{list}\ Z$ is then false.
The need for equirecursive types is usually suppressed by the presence of algebraic data types, which offer isorecursive types, in the language. Yet, they are still necessary in some situations, such as in Objective Caml's object-oriented extension (Rémy and Vouillon, 1998), where recursive object types are commonly inferred. In order to allow recursive object types while still rejecting the above variant of map, Objective Caml's constraint solver implements a selective occurs check, which forbids cycles unless they involve the type constructor $\langle\cdot\rangle$ associated with objects. The corresponding model is a tree model where every infinite path down a tree must encounter the type constructor $\langle\cdot\rangle$ infinitely often.
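The selective occurs check can be sketched as a cycle test on the solved type graph: a cycle is tolerated only if some node on it carries the distinguished object constructor. The following is a toy illustration (ours, not Objective Caml's actual implementation), representing constructed types as mutable lists `[ctor, arg, ...]` so that cyclic graphs can be built.

```python
def selective_occurs_ok(t, obj_ctor="obj"):
    """True iff every cycle in the type graph passes through obj_ctor."""
    def visit(node, path):
        # path holds (id, constructor) for each node on the current DFS stack
        for i, (ident, _) in enumerate(path):
            if ident == id(node):
                # found a cycle: the constructors along it must include obj_ctor
                return any(c == obj_ctor for _, c in path[i:])
        if isinstance(node, list):  # constructed type: [ctor, arg, ...]
            ctor = node[0]
            return all(visit(a, path + [(id(node), ctor)]) for a in node[1:])
        return True  # a variable or atomic type: no further structure
    return visit(t, [])

arrow = ["->", None, None]
arrow[1] = arrow[2] = arrow          # mu X. X -> X: cycle through "->" only
obj = ["obj", None]
obj[1] = ["->", obj, "int"]          # mu X. <X -> int>: cycle through "obj"

print(selective_occurs_ok(arrow))    # False: rejected, like the map variant
print(selective_occurs_ok(obj))      # True: a recursive object type is allowed
```

Cycles that avoid `obj_ctor` correspond to infinite paths that eventually stop meeting $\langle\cdot\rangle$, which the model described above excludes.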

1.10 Universal quantification in constraints

The constraint logic studied so far allows a set of variables $\bar{X}$ to be existentially quantified within a formula $C$. The resulting formula $\exists\bar{X}.C$ receives its standard meaning: it requires $C$ to hold for some $\bar{X}$. However, we currently have no way of requiring a formula $C$ to hold for all $\bar{X}$. Is it possible to extend our logic with universal quantification? If so, what are the new possibilities offered by this extension, in terms of type inference? The present section proposes some answers to these questions.
It is worth noting that, although the standard notation for type schemes involves the symbol $\forall$, type scheme introduction and instantiation constraints do not allow an encoding of universal quantification. Indeed, a universal quantifier in a type scheme is very much like an existential quantifier in a constraint: this is suggested, for instance, by Definition 1.3.3 and by C-LetEx.

Constraints

We extend the syntax of constraints as follows:
$C ::= \ldots \mid \forall\bar{X}.C$
Universally quantified variables are often referred to as rigid, while existentially quantified variables are known as flexible. The logical interpretation of constraints (Figure 1-5) is extended as follows:
$$\frac{\forall \vec{t} \quad \phi[\bar{X} \mapsto \vec{t}\,] \vdash \mathrm{def}\ \Gamma\ \mathrm{in}\ C \qquad \bar{X} \mathrel{\#} ftv(\Gamma)}{\phi \vdash \mathrm{def}\ \Gamma\ \mathrm{in}\ \forall\bar{X}.C} \quad \text{(CM-ForAll)}$$
We let the reader check that none of the results established in Section 1.3 are affected by this addition. Furthermore, the extended constraint language enjoys the following properties.
1.10.1 LEMMA: $\forall\bar{X}.C \Vdash C$. Conversely, $\bar{X} \mathrel{\#} ftv(C)$ implies $C \Vdash \forall\bar{X}.C$.
1.10.2 LEMMA: $\bar{X} \mathrel{\#} ftv(C_2)$ implies $\forall\bar{X}.(C_1 \wedge C_2) \equiv (\forall\bar{X}.C_1) \wedge C_2$.
1.10.3 LEMMA: $\forall\bar{X}.\forall\bar{Y}.C \equiv \forall\bar{X}\bar{Y}.C$.
1.10.4 LEMMA: Let $\bar{X} \mathrel{\#} \bar{Y}$. Then, $\exists\bar{X}.\forall\bar{Y}.C$ entails $\forall\bar{Y}.\exists\bar{X}.C$. Conversely, if $\exists\bar{Y}.C$ determines $\bar{X}$, then $\forall\bar{Y}.\exists\bar{X}.C$ entails $\exists\bar{X}.\forall\bar{Y}.C$.
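The easy direction of Lemma 1.10.4 ($\exists\bar{X}.\forall\bar{Y}.C$ entails $\forall\bar{Y}.\exists\bar{X}.C$) can be sanity-checked by brute force over a finite domain; the check below (ours, purely illustrative, with a two-point domain standing in for the model) enumerates all sixteen predicates $C(X, Y)$ and also exhibits a counterexample showing that the converse fails without the "determines" side condition.

```python
from itertools import product

D = (0, 1)
pairs = list(product(D, D))

def exists_forall(p):
    # EX. AY. C : some X works for every Y
    return any(all(p[(x, y)] for y in D) for x in D)

def forall_exists(p):
    # AY. EX. C : for each Y, some (possibly different) X works
    return all(any(p[(x, y)] for x in D) for y in D)

# all 16 predicates C(X, Y) over the two-point domain
all_preds = [dict(zip(pairs, bits))
             for bits in product((False, True), repeat=4)]

# the entailment holds for every predicate
assert all(forall_exists(p) for p in all_preds if exists_forall(p))

# converse counterexample: C(X, Y) defined as X = Y
eq = {(x, y): x == y for x, y in pairs}
print(forall_exists(eq), exists_forall(eq))  # True False
```

The counterexample is the familiar one: for each $Y$ we can pick $X := Y$, but no single $X$ equals every $Y$; the "determines" condition rules out exactly this dependence of $X$ on $Y$.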

Constraint solving

We briefly explain how to extend the constraint solver described in Section 1.8 with support for universal quantification. (Thus, we again assume an equality-only free tree model.) Constraint solving in the presence of equations and of existential and universal quantifiers is known as unification under a mixed prefix. It is a particular case of the decision problem for the first-order theory of equality on trees; see e.g. Comon and Lescanne (1989). Extending our solver is straightforward: in fact, the treatment of universal quantifiers turns
S-Solve-All: $S;\ U;\ \forall\bar{X}.C \;\rightarrow\; S[\forall\bar{X}.\square];\ U;\ C$
  if $\bar{X} \mathrel{\#} ftv(U)$

S-All-Ex: $S[\forall\bar{X}.\exists\bar{Y}\bar{Z}.\square];\ U;\ \mathrm{true} \;\rightarrow\; S[\exists\bar{Y}.\forall\bar{X}.\exists\bar{Z}.\square];\ U;\ \mathrm{true}$
  if $\bar{X} \mathrel{\#} \bar{Y} \,\wedge\, \exists\bar{X}\bar{Z}.U$ determines $\bar{Y}$

S-All-Fail-1: $S[\forall\bar{X}X.\exists\bar{Y}.\square];\ U;\ \mathrm{true} \;\rightarrow\; \mathrm{false}$
  if $X \notin \bar{Y} \,\wedge\, X \prec_U^{\star} Z \,\wedge\, Z \notin X\bar{Y}$

S-All-Fail-2: $S[\forall\bar{X}X.\exists\bar{Y}.\square];\ X = T = \epsilon \wedge U;\ \mathrm{true} \;\rightarrow\; \mathrm{false}$
  if $X \notin \bar{Y} \,\wedge\, T \notin \mathcal{V}$

S-Pop-All: $S[\forall\bar{X}.\exists\bar{Y}.\square];\ U_1 \wedge U_2;\ \mathrm{true} \;\rightarrow\; S;\ U_1;\ \mathrm{true}$
  if $\bar{X}\bar{Y} \mathrel{\#} ftv(U_1) \,\wedge\, \exists\bar{Y}.U_2 \equiv \mathrm{true}$

Figure 1-16: Solving universal constraints

out to be surprisingly analogous to that of let constraints. To begin, we extend the syntax of stacks with so-called universal frames:
$S ::= \ldots \mid S[\forall\bar{X}.\square]$
Because existential quantifiers cannot, in general, be hoisted out of universal quantifiers, rules S-Ex-1 to S-Ex-4 now allow floating them up to the nearest enclosing let or universal frame, if any, or to the outermost level, otherwise. Thus, in our machine representation of stacks, where rules S-Ex-1 to S-Ex-4 are applied in an eager fashion, every universal frame carries a list of the type variables that are existentially bound immediately after it, and integer ranks count not only let frames, but also universal frames.
The solver's specification is extended with the rules in Figure 1-16. S-Solve-All, a forward rule, discovers a universal constraint and enters it, creating a new universal frame to record its existence. S-All-Ex exploits Lemma 1.10.4 to hoist existential quantifiers out of the universal frame. It is analogous to S-LetAll, and its implementation may rely upon the same procedure (Exercise 1.8.8). The next two rules detect failure conditions. S-All-Fail-1 states that the constraint $\forall X.\exists\bar{Y}.U$ is false if the rigid variable $X$ is directly or indirectly dominated by a free variable $Z$. Indeed, the value of $X$ is then determined by that of $Z$; but a universally quantified variable ranges over all values, so this is a contradiction. In such a case, $X$ is commonly said to escape its scope. S-All-Fail-2 states that the same constraint is false if $X$ is equated with a nonvariable term. Indeed, the value of $X$ is then
partially determined, since its head constructor is known, which again contradicts its universal status. Last, S-Pop-All splits the current unification constraint into two components $U_1$ and $U_2$, where $U_1$ is made up entirely of old variables and $U_2$ constrains young variables only. This decomposition is analogous to that performed by S-Pop-Let. Then, it is not difficult to check that $\forall\bar{X}.\exists\bar{Y}.(U_1 \wedge U_2)$ is equivalent to $U_1$. So, the universal frame, as well as $U_2$, are discarded, and the solver proceeds by examining whatever remains on top of the stack $S$.
It is possible to further extend the treatment of universal frames with two rules analogous to S-COMPRESS and S-UNNAME. In practice, this improves the solver's efficiency, and makes it easier to share code between the treatment of let frames and that of universal frames.
It is interesting to remark that, as far as the underlying unification algorithm is concerned, there is no difference between existentially and universally quantified type variables. The algorithm solves whatever equations are presented to it, without inquiring about the status of their variables. Equations that lead to failure, because a rigid variable escapes its scope or is equated with a nonvariable term, are detected only when the universal frame is exited. A perhaps more common approach is to mark rigid variables as such, allowing the unification algorithm to signal failure as soon as one of the two error conditions is encountered. In this approach, a rigid variable may successfully unify only with itself or with flexible variables fresher than itself. It is often called a Skolem constructor in the literature (Läufer and Odersky, 1994; Shields and Peyton Jones, 2002). An interesting variant of this approach appears in Dowek, Hardin, Kirchner, and Pfenning's treatment of (higher-order) unification (1995; 1998), where flexible variables are represented as ordinary variables, while rigid variables are encoded using de Bruijn indices.
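The eager variant just described can be sketched as follows. This is an illustrative unifier of our own devising: variables carry a rigid flag, and the two error conditions (rigid variable equated with a nonvariable term, or with another rigid variable) are signalled immediately rather than when the frame is exited. For brevity, the freshness ranks that govern which flexible variables a rigid variable may absorb are omitted.

```python
class V:
    """A unification variable, flexible by default."""
    def __init__(self, name, rigid=False):
        self.name, self.rigid, self.link = name, rigid, None

def find(t):
    while isinstance(t, V) and t.link is not None:
        t = t.link
    return t

def unify(t1, t2):
    """Terms are variables V or tuples ('->', dom, cod), ('int',), etc."""
    t1, t2 = find(t1), find(t2)
    if t1 is t2:
        return True
    if isinstance(t1, V) and isinstance(t2, V):
        # orient the link so that a flexible variable points to the other one
        if t1.rigid:
            t1, t2 = t2, t1
        if t1.rigid:
            return False  # two distinct rigid variables: immediate failure
        t1.link = t2
        return True
    if isinstance(t2, V):
        t1, t2 = t2, t1
    if isinstance(t1, V):
        if t1.rigid:
            return False  # rigid variable vs. constructed term: fail eagerly
        t1.link = t2
        return True
    return (t1[0] == t2[0] and len(t1) == len(t2)
            and all(unify(a, b) for a, b in zip(t1[1:], t2[1:])))

X = V("X", rigid=True)
Y = V("Y")
print(unify(Y, X))         # True: the flexible Y may take the value X
print(unify(X, ("int",)))  # False: X's head constructor would become known
```

The second failure is exactly the condition detected by S-All-Fail-2; the escape condition of S-All-Fail-1 is what the omitted ranks would enforce.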
The properties of our constraint solver are preserved by this extension: it is possible to prove that Lemmas 1.8.9, 1.8.10, and 1.8.11 remain valid.

Type annotations, continued

In Section 1.9, we introduced the expression form $(t : \exists\bar{X}.T)$, allowing an expression $t$ to be annotated with a type $T$ whose free type variables $\bar{X}$ are locally and existentially bound. It is now natural to introduce the symmetric expression form $(t : \forall\bar{X}.T)$, where $T$ has kind $\star$, $\bar{X}$ is bound within $T$, and $\bar{X}$ contains $ftv(T)$, as before. Its constraint generation rule is as follows:
$$\llbracket(t : \forall\bar{X}.T) : T'\rrbracket = \forall\bar{X}.\llbracket t : T\rrbracket \wedge \exists\bar{X}.(T \leq T') \qquad \text{provided } \bar{X} \mathrel{\#} ftv(t, T')$$
The first conjunct requires $t$ to have type $T$ for all values of $\bar{X}$. Here, the type variables $\bar{X}$ are universally bound, as expected. The second conjunct requires $T'$ to be some instance of the universal annotation $\forall\bar{X}.T$. Since $T'$ is only a monotype, it seems difficult to think of another sensible way of constraining $T'$. For this reason, the type variables $\bar{X}$ are still existentially bound in the second conjunct. This makes the interpretation of the universal quantifier in type annotations a bit more complex than that of the existential quantifier. For instance, when subtyping is interpreted as equality, the constraint generation rule may be read: a valid type for $(t : \forall\bar{X}.T)$ is of the form $T$, for some choice of the type variables $\bar{X}$, provided $t$ has type $T$ for all choices of $\bar{X}$.
We remark that $(t : \forall\bar{X}.T)$ must be a new expression form: it cannot be encoded by adding new constants to the calculus, whereas $(t : \exists\bar{X}.T)$ could, because none of the existing constraint generation rules produce universally quantified constraints. Like all type annotations, it has identity semantics.
What is the use of universal type annotations, compared with existential type annotations? When a type variable is existentially bound, the typechecker is free to assign it whatever value makes the program well-typed. As a result, the expressions $(\lambda z.z \mathbin{\hat{+}} \hat{1} : \exists X.X \rightarrow X)$ and $(\lambda z.z : \exists X.X \rightarrow X)$ are both well-typed: $X$ is assigned int in the former case, and remains undetermined in the latter. However, it is sometimes useful to be able to insist that an expression should be polymorphic. This effect is naturally achieved by using a universally bound type variable. Indeed, $(\lambda z.z \mathbin{\hat{+}} \hat{1} : \forall X.X \rightarrow X)$ is ill-typed, because $\forall X.(X = \mathrm{int})$ is false, while $(\lambda z.z : \forall X.X \rightarrow X)$ is well-typed.
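The two key constraints can be evaluated directly by interpreting the quantifiers over a toy universe of ground types; this is a deliberate simplification of ours (the real model is infinite), but it captures why one annotation is satisfiable and the other is not.

```python
# A two-element universe of ground types stands in for the model.
GROUND = {"int", "bool"}

def forall(p):
    return all(p(t) for t in GROUND)

def exists(p):
    return any(p(t) for t in GROUND)

# (λz. z +̂ 1̂ : ∃X. X → X) generates, roughly, ∃X.(X = int): satisfiable
print(exists(lambda X: X == "int"))  # True

# (λz. z +̂ 1̂ : ∀X. X → X) generates, roughly, ∀X.(X = int): false
print(forall(lambda X: X == "int"))  # False
```

The existential annotation lets the solver choose $X := \mathrm{int}$, while the universal annotation demands the constraint for every ground type at once.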
1.10.5 EXERCISE [$\star$]: Write down the constraints $\exists Z.\llbracket(\lambda z.z \mathbin{\hat{+}} \hat{1} : \forall X.X \rightarrow X) : Z\rrbracket$ and $\exists Z.\llbracket(\lambda z.z : \forall X.X \rightarrow X) : Z\rrbracket$, which tell whether these expressions are well-typed. Check that the former is false, while the latter is satisfiable.
A universal type annotation, as defined above, is nothing but a (closed) Damas-Milner type scheme. Thus, the new construct $(t : \forall\bar{X}.T)$ gives us the ability to ensure that the expression $t$ admits the type scheme $\forall\bar{X}.T$. This feature is exploited at the module level in ML-the-programming-language, where it is necessary to check that the inferred type for a module component $t$ is more general than the type scheme $S$ that appears in the module's signature. In our view, this process simply consists in ensuring that $(t : S)$ is well-typed.
In Section 1.9, we have pointed out that local (that is, closed) type annotations offer limited expressiveness, because they cannot share type variables. To lift this limitation, we have introduced the expression forms $\exists\bar{X}.t$ and $(t : T)$. The former binds the type variables $\bar{X}$ within $t$, making them available for use in type annotations, and instructs the constraint generator to existentially quantify them at this point. The latter requires $t$ to have type $T$. It is natural to proceed in the same manner in the case of universal type annotations. We now introduce the expression form $\forall\bar{X}.t$, which also binds $\bar{X}$ within $t$, but comes
with a different constraint generation rule:
$$\llbracket\forall\bar{X}.t : T\rrbracket = \forall\bar{X}.\exists Z.\llbracket t : Z\rrbracket \wedge \exists\bar{X}.\llbracket t : T\rrbracket \qquad \text{provided } \bar{X} \mathrel{\#} ftv(T) \wedge Z \notin ftv(t)$$
This rule is a bit more complex than that associated with the expression form $\exists\bar{X}.t$. Again, this is due to the fact that we do not wish to overconstrain $T$. The first exercise below shows that a more naïve version of the rule does not yield the desired behavior. The second exercise shows that this version does. The third exercise clarifies an efficiency concern.
1.10.6 EXERCISE [$\star$]: Assume that $\llbracket\forall\bar{X}.t : T\rrbracket$ is defined as $\forall\bar{X}.\llbracket t : T\rrbracket$, provided $\bar{X} \mathrel{\#} ftv(T)$. Write down the constraint $\llbracket\forall X.(\lambda z.z : X \rightarrow X) : Z\rrbracket$. Can you describe its solutions? Does it have the intended meaning?
1.10.7 EXERCISE [$\star\star$]: Let $\bar{X} \supseteq ftv(T)$ and $\bar{X} \mathrel{\#} ftv(t)$. Check that the constraints $\llbracket(t : \forall\bar{X}.T) : T'\rrbracket$ and $\llbracket\forall\bar{X}.(t : T) : T'\rrbracket$ are equivalent. In other words, local universal type annotations may also be expressed in terms of the more complex constructs described above.
1.10.8 EXERCISE [$\star\star\star\star$, $\nrightarrow$]: The constraint generation rule that appears above compromises the linear time and space complexity of constraint generation, because it duplicates the term $t$. It is possible to avoid this problem, but this requires a slight generalization of the constraint language. Let us write $\mathrm{let}\ x : \forall\underline{\bar{X}}\bar{Y}[C_1].T\ \mathrm{in}\ C_2$ for $\forall\bar{X}.\exists\bar{Y}.C_1 \wedge \mathrm{def}\ x : \forall\bar{X}\bar{Y}[C_1].T\ \mathrm{in}\ C_2$. In this extended let form, the underlined variables $\bar{X}$ are interpreted as rigid, instead of flexible, while checking that $C_1$ is satisfiable. However, the type scheme associated with $x$ is not affected. Check that the above constraint generation rule may now be written as follows:
$$\llbracket\forall\bar{X}.t : T\rrbracket = \mathrm{let}\ x : \forall\underline{\bar{X}}Z[\llbracket t : Z\rrbracket].Z\ \mathrm{in}\ x \preceq T \qquad \text{provided } Z \notin ftv(t)$$
Roughly speaking, the new rule forms a most general type scheme for $t$, ensures that the type variables $\bar{X}$ are unconstrained in it, and checks that $T$ is an instance of it. Furthermore, it does not duplicate $t$. To complete the exercise, extend the specification of the constraint solver (Figures 1-12 and 1-16), as well as its implementation, to deal with this extension of the constraint language.
To conclude, let us once again stress that, if $T$ has free type variables, the effect of the type annotation $(t : T)$ depends on how and where they are bound. The effect of how stems from the fact that binding a type variable universally, rather than existentially, leads to a stricter constraint. Indeed, we let the reader check that $\llbracket\forall\bar{X}.t : T\rrbracket$ entails $\llbracket\exists\bar{X}.t : T\rrbracket$, while the converse does not hold in general. The effect of where has been illustrated, in the case of existentially bound type variables, in Section 1.9. It is due, in that case, to the fact that let and $\exists$ do not commute. In the case of universally bound type variables, it may be imputed to the fact that $\forall$ and $\exists$ do not commute. For instance, $\lambda z.\forall X.(z : X)$ is ill-typed, because, inside the $\lambda$-abstraction, the program variable $z$ cannot be said to have every type. However, $\forall X.\lambda z.(z : X)$ is well-typed, because the identity function does have type $X \rightarrow X$ for every $X$.
1.10.9 EXERCISE [$\star$]: Write down the constraints $\exists Z.\llbracket\lambda z.\forall X.(z : X) : Z\rrbracket$ and $\exists Z.\llbracket\forall X.\lambda z.(z : X) : Z\rrbracket$, which tell whether these expressions are well-typed. Is the former satisfiable? Is the latter?
In Standard ML and Objective Caml, the type variables that appear in type annotations are implicitly bound. That is, there is no syntax in the language for the constructs $\exists\bar{X}.t$ and $\forall\bar{X}.t$. When a type annotation $(t : T)$ contains a free type variable $X$, a fixed convention tells how and where $X$ is bound. In Standard ML, $X$ is universally bound at the nearest val binding that encloses all related occurrences of $X$ (Milner, Tofte, and Harper, 1990). The 1997 revision of Standard ML (Milner, Tofte, Harper, and MacQueen, 1997b) slightly improves on this situation by allowing type variables to be explicitly introduced at val bindings. However, they still must be universally bound. In Objective Caml, $X$ is existentially bound at the nearest enclosing toplevel let binding; this behavior seems to be presently undocumented. We argue that (i) allowing type variables to be implicitly introduced is confusing; and (ii) for expressiveness, both universal and existential quantifiers should be made available to programmers. Surprisingly, these language design and type inference issues seem to have received little attention in the literature, although they have most likely been "folklore" for a long time. Peyton Jones and Shields (2003) study these issues in the context of Haskell, and concur with (i). Concerning (ii), they seem to think that the language designer must choose between existential and universal type variable introduction forms, which they refer to as "type-sharing" and "type-lambda", whereas we point out that they may and should coexist.

Polymorphic recursion

Example 1.2.10 explains how the letrec construct found in ML-the-programming-language may be viewed as an application of the constant fix, wrapped inside a normal let construct. Exercise 1.9.6 shows that this gives rise to a somewhat restrictive constraint generation rule: generalization occurs only after the application of fix is typechecked. In other words, in letrec $f = \lambda z.t_1$ in $t_2$, all occurrences of $f$ within $t_1$ must have the same (monomorphic) type. This restriction is sometimes a nuisance, and seems unwarranted: if the function being defined is polymorphic, it should be possible to use it at different types even inside its own definition. Indeed, Mycroft (1984) extended Damas and Milner's type system with a more liberal treatment of recursion, commonly known as polymorphic recursion. The idea is to only require the occurrences of $f$ within $t_1$ to have the same type scheme. Hence, they may have different types, all of which are instances of a common type scheme. It was later shown that well-typedness in Mycroft's extended type system is undecidable (Henglein, 1993; Kfoury, Tiuryn, and Urzyczyn, 1993). To work around this stumbling block, one solution is to use a semi-algorithm, falling back to monomorphic recursion if it does not succeed or fail in reasonable time. Although such a solution might be appealing in the setting of an automated program analysis, it is less so in the setting of a programmer-visible type system, because it may become difficult to understand why a program is ill-typed. Thus, we describe a simpler solution, which consists in requiring the programmer to explicitly supply a type scheme for $f$. This is an instance of a mandatory type annotation.
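Objective Caml (since version 3.12) offers exactly this solution: a mandatory, explicitly polymorphic annotation on the recursive definition. A sketch, using a nested datatype whose traversal genuinely requires polymorphic recursion:

```ocaml
(* A "nested" datatype: the recursive occurrence is at type ('a * 'a) seq,
   so a function traversing it must use itself at a different type. *)
type 'a seq = Nil | Cons of 'a * ('a * 'a) seq

(* The annotation 'a. 'a seq -> int supplies the required type scheme;
   without it, monomorphic recursion makes inference fail. *)
let rec length : 'a. 'a seq -> int = function
  | Nil -> 0
  | Cons (_, rest) -> 1 + length rest   (* used at type ('a * 'a) seq *)

let () = assert (length (Cons (1, Cons ((2, 3), Nil))) = 2)
```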
To begin, we must change the status of fix, because if fix remains a constant, then $f$ must remain $\lambda$-bound and cannot receive a polymorphic type scheme. We turn fix into a language construct, which binds a program variable $f$ and annotates it with a DM type scheme. The syntax of values and expressions is thus extended as follows:

$$v ::= \ldots \mid \textit{fix}\ f\!:\!S.\lambda z.t \qquad\qquad t ::= \ldots \mid \textit{fix}\ f\!:\!S.\lambda z.t$$
Please note that $f$ is bound within $\lambda z.t$. The operational semantics is extended as follows.
$$(\textit{fix}\ f\!:\!S.\lambda z.t)\ v \longrightarrow (\textbf{let}\ f = \textit{fix}\ f\!:\!S.\lambda z.t\ \textbf{in}\ \lambda z.t)\ v \tag{R-FIX'}$$
The type annotation $S$ plays no essential role in the reduction; it is merely preserved. It is now possible to define letrec $f\!:\!S = \lambda z.t_1$ in $t_2$ as syntactic sugar for let $f = \textit{fix}\ f\!:\!S.\lambda z.t_1$ in $t_2$.
We now give a constraint generation rule for fix:
$$\llbracket \textit{fix}\ f\!:\!S.\lambda z.t : T \rrbracket = \textbf{let}\ f : S\ \textbf{in}\ \llbracket \lambda z.t : S \rrbracket \wedge S \preceq T$$
The left-hand conjunct requires the function $\lambda z.t$ to have type scheme $S$, under the assumption that $f$ has type scheme $S$. Thus, it is now possible for different occurrences of $f$ within $t$ to receive different types. If $S$ is $\forall\bar{X}.T$, where $\bar{X} \# \mathrm{ftv}(t)$, then we write $\llbracket t : S \rrbracket$ for $\forall\bar{X}.\llbracket t : T \rrbracket$. Indeed, checking the validity of a polymorphic type annotation - be it mandatory, as is the case here, or optional, as was previously the case - requires a universally quantified constraint. The right-hand conjunct merely constrains $T$ to be an instance of $S$.
Given the definition of letrec $f\!:\!S = \lambda z.t_1$ in $t_2$ as syntactic sugar, the above rule leads to the following derived constraint generation rule for letrec:
$$\llbracket \textit{letrec}\ f\!:\!S = \lambda z.t_1\ \textbf{in}\ t_2 : T \rrbracket = \textbf{let}\ f : S\ \textbf{in}\ (\llbracket \lambda z.t_1 : S \rrbracket \wedge \llbracket t_2 : T \rrbracket)$$
This rule is arguably quite natural. The program variable $f$ is assigned the type scheme $S$ throughout its scope, that is, both inside and outside of the function's definition. The function $\lambda z.t_1$ must itself have type scheme $S$. Last, $t_2$ must have type $T$, as in every let construct.
1.10.10 Exercise [$\star\star$]: Prove that the derived constraint generation rule above is indeed valid.
It is straightforward to prove that the extended language still enjoys subject reduction. The proof relies on the following lemma: if $t$ has type scheme $S$, then every instance of $S$ is also a valid type for $t$.
1.10.11 Lemma: $\llbracket t : S \rrbracket \wedge S \preceq T \Vdash \llbracket t : T \rrbracket$.
1.10.12 Theorem [Subject Reduction]: (R-Fix') $\subseteq (\sqsubseteq)$.
The programming language Haskell (Hudak, Peyton Jones, Wadler, Boutel, Fairbairn, Fasel, Guzman, Hammond, Hughes, Johnsson, Kieburtz, Nikhil, Partain, and Peterson, 1992) offers polymorphic recursion. Interesting details about its typing rules may be found in (Jones, 1999).
It is worth pointing out that some restricted instances of type inference in the presence of polymorphic recursion are decidable. This is typically the case in certain program analyses, where a type derivation for the program is already available, and the goal is only to infer extra atomic annotations, such as binding time or strictness properties. Several papers that exploit this idea are (Dussart, Henglein, and Mossin, 1995a; Jensen, 1998; Rehof and Fähndrich, 2001).

Universal types

ML-the-type-system enforces a strict stratification between types and type schemes or, in other words, allows only prenex universal quantifiers inside types. We have pointed out earlier that there is good reason to do so: type inference for ML-the-type-system is decidable, while type inference for System F, which has no such restriction, is undecidable. Yet, this restriction comes at a cost in expressiveness: it prevents higher-order functions from accepting polymorphic function arguments, and forbids storing polymorphic functions inside data structures. Fortunately, it is in fact possible to circumvent the problem by requiring the programmer to supply additional type information.
The approach that we are about to describe is reminiscent of the way algebraic data type definitions allow circumventing the problems associated with equirecursive types (Section 1.9). Because we do not wish to extend the syntax of types with universal types of the form $\forall\bar{Y}.T$, we instead allow universal type definitions, of the form
$$D\,\vec{X} \approx \forall\bar{Y}.T$$
where $D$ still ranges over data types. If $D$ has signature $\vec\kappa \Rightarrow \star$, then the type variables $\vec{X}$ must have kind $\vec\kappa$. The type $T$ must have kind $\star$. The type variables $\bar{X}$ and $\bar{Y}$ are considered bound within $T$, and the definition must be closed, that is, $\mathrm{ftv}(T) \subseteq \bar{X}\bar{Y}$ must hold. Last, the variance of the type constructor $D$ must match its definition - a requirement stated as follows:
1.10.13 Definition: Let $D\,\vec{X} \approx \forall\bar{Y}.T$ and $D\,\vec{X}' \approx \forall\bar{Y}'.T'$ be two $\alpha$-equivalent instances of a single universal type definition, such that $\bar{Y} \# \mathrm{ftv}(T')$ and $\bar{Y}' \# \mathrm{ftv}(T)$. Then, $D\,\vec{X} \le D\,\vec{X}' \Vdash \forall\bar{Y}'.\exists\bar{Y}.T \le T'$ must hold.
This requirement is analogous to that found in Definition 1.9.8. The idea is that, if $D\,\vec{X}$ and $D\,\vec{X}'$ are comparable, then their unfoldings $\forall\bar{Y}.T$ and $\forall\bar{Y}'.T'$ should be comparable as well. The comparison between them is expressed by the constraint $\forall\bar{Y}'.\exists\bar{Y}.T \le T'$, which may be read: every instance of $\forall\bar{Y}'.T'$ is (a supertype of) an instance of $\forall\bar{Y}.T$. Again, when subtyping is interpreted as equality, the requirement of Definition 1.10.13 is always satisfied; it becomes nontrivial only in the presence of true subtyping.
The effect of the universal type definition $D\,\vec{X} \approx \forall\bar{Y}.T$ is to enrich the programming language with a new construct:
$$v ::= \ldots \mid \textit{pack}_D\ v \qquad t ::= \ldots \mid \textit{pack}_D\ t \qquad \mathcal{E} ::= \ldots \mid \textit{pack}_D\ \mathcal{E}$$
and with a new unary destructor $\textit{open}_D$. Their operational semantics is as follows:
$$\textit{open}_D\,(\textit{pack}_D\ v) \xrightarrow{\ \delta\ } v \tag{R-OPEN-ALL}$$
Intuitively, $\textit{pack}_D$ and $\textit{open}_D$ are the two coercions that witness the isomorphism between $D\,\vec{X}$ and $\forall\bar{Y}.T$. The value $\textit{pack}_D\ v$ behaves exactly like $v$, except it is marked, as a hint to the typechecker. As a result, the mark must be removed, using $\textit{open}_D$, before the value can be used.
What are the typing rules for $\textit{pack}_D$ and $\textit{open}_D$? In System F, they would receive the types $\forall\bar{X}.(\forall\bar{Y}.T) \to D\,\vec{X}$ and $\forall\bar{X}.D\,\vec{X} \to \forall\bar{Y}.T$, respectively. However, neither of these is a valid type scheme: both exhibit a universal quantifier under an arrow.
In the case of $\textit{pack}_D$, which has been made a language construct rather than a constant, we work around the problem by embedding this universal quantifier in the constraint generation rule:
$$\llbracket \textit{pack}_D\ t : T' \rrbracket = \exists\bar{X}.(\llbracket t : \forall\bar{Y}.T \rrbracket \wedge D\,\vec{X} \le T')$$
The rule implicitly requires that $\bar{X}$ be fresh for the left-hand side and that $D\,\vec{X} \approx \forall\bar{Y}.T$ be (an $\alpha$-variant of) the definition of $D$. The left-hand conjunct requires $t$ to have type scheme $\forall\bar{Y}.T$. The notation $\llbracket t : S \rrbracket$ was defined on page 114. The right-hand conjunct states that a valid type for $\textit{pack}_D\ t$ is (a supertype of) $D\,\vec{X}$.
We deal with $\textit{open}_D$ as follows. Provided $\bar{X} \# \bar{Y}$, we extend the initial environment $\Gamma_0$ with the binding $\textit{open}_D : \forall\bar{X}\bar{Y}.D\,\vec{X} \to T$. We have simply hoisted the universal quantifier outside of the arrow - a valid isomorphism in System F.
The proof of the subject reduction theorem must be extended with the following new case:
1.10.14 Theorem [Subject Reduction]: (R-Open-All) $\subseteq (\sqsubseteq)$.
Proof: We have
$$\begin{array}{lll}
& \textbf{let}\ \Gamma_0\ \textbf{in}\ \llbracket \textit{open}_D\,(\textit{pack}_D\ v) : T_0 \rrbracket \\
\equiv & \textbf{let}\ \Gamma_0\ \textbf{in}\ \exists Z.(\textit{open}_D \preceq Z \to T_0 \wedge \llbracket \textit{pack}_D\ v : Z \rrbracket) & (1) \\
\equiv & \textbf{let}\ \Gamma_0\ \textbf{in}\ \exists Z.\bigl(\exists\bar{X}'\bar{Y}'.(D\,\vec{X}' \to T' \le Z \to T_0) \wedge \exists\bar{X}.(\llbracket v : \forall\bar{Y}.T \rrbracket \wedge D\,\vec{X} \le Z)\bigr) & (2) \\
\equiv & \textbf{let}\ \Gamma_0\ \textbf{in}\ \exists\bar{X}\bar{X}'\bar{Y}'.(\llbracket v : \forall\bar{Y}.T \rrbracket \wedge D\,\vec{X} \le D\,\vec{X}' \wedge T' \le T_0) & (3) \\
\Vdash & \textbf{let}\ \Gamma_0\ \textbf{in}\ \exists\bar{X}\bar{X}'\bar{Y}'.(\llbracket v : \forall\bar{Y}.T \rrbracket \wedge T \le T' \wedge T' \le T_0) & (4) \\
\Vdash & \textbf{let}\ \Gamma_0\ \textbf{in}\ \exists\bar{X}\bar{Y}\bar{X}'\bar{Y}'.\llbracket v : T_0 \rrbracket & (5) \\
\equiv & \textbf{let}\ \Gamma_0\ \textbf{in}\ \llbracket v : T_0 \rrbracket & (6)
\end{array}$$
where (1) is by definition of constraint generation for applications and for constants, with $Z$ fresh; (2) is by definition of constraint generation for $\textit{pack}_D$ and $\textit{open}_D$, where $D\,\vec{X} \approx \forall\bar{Y}.T$ and $D\,\vec{X}' \approx \forall\bar{Y}'.T'$ are two $\alpha$-equivalent instances of the definition of $D$, and $\bar{X}$, $\bar{Y}$, $\bar{X}'$, and $\bar{Y}'$ are fresh and satisfy $\bar{Y} \# \mathrm{ftv}(T')$ and $\bar{Y}' \# \mathrm{ftv}(T)$; (3) is by C-ExAnd, C-Arrow, and C-ExTrans, which allow eliminating $Z$; (4) is by Definition 1.10.13, Lemma 1.10.1, and C-ExAnd; (5) is by Lemmas 1.10.11 and 1.6.3; (6) is by C-Ex*.
The proof of (R-Context) $\subseteq (\sqsubseteq)$ must also be extended with a new subcase, corresponding to the new production $\mathcal{E} ::= \ldots \mid \textit{pack}_D\ \mathcal{E}$. If the language is pure, this is straightforward. In the presence of side effects, however, this subcase fails, because universal and existential quantifiers in constraints do not commute. The problem is then avoided by restricting $\textit{pack}_D$ to values, as in Definition 1.7.7.
This approach to extending ML-the-type-system with universal (or existential - see below) types has been studied in (Läufer and Odersky, 1994; Rémy, 1994; Odersky and Läufer, 1996; Shields and Peyton Jones, 2002). Läufer and Odersky have suggested combining universal or existential type declarations with algebraic data type definitions. This allows suppressing the cumbersome $\textit{pack}_D$ and $\textit{open}_D$ constructs; instead, one simply uses the standard syntax for constructing and deconstructing variants and records.
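This is the route taken by Objective Caml, where a record field may be declared with an explicitly polymorphic type; record construction then plays the role of $\textit{pack}_D$ and field projection that of $\textit{open}_D$. A sketch:

```ocaml
(* D ≈ ∀Y.T, declared as a record with an explicitly polymorphic field.
   Building the record is pack; projecting the field is open. *)
type poly_id = { apply : 'a. 'a -> 'a }

let p = { apply = (fun x -> x) }   (* pack *)

(* The packed function may be used at two different types, which a
   prenex (DM) type scheme for a lambda-bound variable would forbid. *)
let n = p.apply 1
let s = p.apply "hello"
let () = assert (n = 1 && s = "hello")
```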

Existential types

Existential types (TAPL Chapter 24) are close cousins of universal types, and may be introduced into ML-the-type-system in the same manner. In fact, existential types were introduced into ML-the-type-system before universal types. We give a brief description of this extension, insisting mainly on the differences with the case of universal types.
We now allow existential type definitions, of the form $D\,\vec{X} \approx \exists\bar{Y}.T$. The conditions required of a well-formed definition are unchanged, except the variance requirement, which is dual:
1.10.15 Definition: Let $D\,\vec{X} \approx \exists\bar{Y}.T$ and $D\,\vec{X}' \approx \exists\bar{Y}'.T'$ be two $\alpha$-equivalent instances of a single existential type definition, such that $\bar{Y} \# \mathrm{ftv}(T')$ and $\bar{Y}' \# \mathrm{ftv}(T)$. Then, $D\,\vec{X} \le D\,\vec{X}' \Vdash \forall\bar{Y}.\exists\bar{Y}'.T \le T'$ must hold.
The effect of this existential type definition is to enrich the programming language with a new unary constructor $\textit{pack}_D$ and with a new construct: $t ::= \ldots \mid \textit{open}_D\ t\ t$ and $\mathcal{E} ::= \ldots \mid \textit{open}_D\ \mathcal{E}\ t \mid \textit{open}_D\ v\ \mathcal{E}$. Their operational semantics is as follows:
$$\textit{open}_D\,(\textit{pack}_D\ v_1)\ v_2 \longrightarrow v_2\ v_1 \tag{R-OPEN-EX}$$
In the literature, the second argument of $\textit{open}_D$ is often required to be a $\lambda$-abstraction $\lambda z.t$, so the construct becomes $\textit{open}_D\ t\ (\lambda z.t)$, often written open$_D$ $t$ as $z$ in $t$.
Provided $\bar{X} \# \bar{Y}$, we extend the initial environment $\Gamma_0$ with the binding $\textit{pack}_D : \forall\bar{X}\bar{Y}.T \to D\,\vec{X}$. The constraint generation rule for $\textit{open}_D$ is as follows:
$$\llbracket \textit{open}_D\ t_1\ t_2 : T' \rrbracket = \exists\bar{X}.(\llbracket t_1 : D\,\vec{X} \rrbracket \wedge \llbracket t_2 : \forall\bar{Y}.T \to T' \rrbracket)$$
The rule implicitly requires that $\bar{X}$ be fresh for the left-hand side, that $\bar{Y}$ be fresh for $T'$, and that $D\,\vec{X} \approx \exists\bar{Y}.T$ be (an $\alpha$-variant of) the definition of $D$. The left-hand conjunct simply requires $t_1$ to have type $D\,\vec{X}$. The right-hand conjunct states that the function $t_2$ must be prepared to accept an argument of type $T$, for any $\bar{Y}$, and produce a result of the expected type $T'$. In other words, $t_2$ must be a polymorphic function.
The type scheme of the existential $\textit{pack}_D$ resembles that of the universal $\textit{open}_D$, while the constraint generation rule for the existential $\textit{open}_D$ is a close cousin of that for the universal $\textit{pack}_D$. Thus, the duality between universal and existential types is rather strong. The main difference lies in the fact that the existential $\textit{open}_D$ construct is binary, rather than unary, so as to limit the scope of the newly introduced type variables $\bar{Y}$. The duality may be better understood by studying the encoding of existential types in terms of universal types (Reynolds, 1983b).
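For the existential side, modern OCaml again follows the Läufer-Odersky route: a data constructor may hide a type variable, and pattern matching plays the role of the binary $\textit{open}_D$, limiting the scope of the fresh type variable to the match branch. A sketch:

```ocaml
(* D ≈ ∃Y.T: the constructor Pack hides the witness type 'a. *)
type showable = Pack : 'a * ('a -> string) -> showable

(* Matching on Pack is open: 'a is abstract within the branch
   and cannot escape it. *)
let show (Pack (v, print)) = print v

(* Values with different witness types inhabit the same type. *)
let items = [ Pack (42, string_of_int); Pack ("hi", fun s -> s) ]
let () = assert (List.map show items = [ "42"; "hi" ])
```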
As expected, R-Open-Ex preserves types.
1.10.16 Theorem [Subject Reduction]: (R-Open-Ex) $\subseteq (\sqsubseteq)$.
1.10.17 Exercise [$\star\star, \nrightarrow$]: Prove Theorem 1.10.16. The proof is analogous, although not identical, to that of Theorem 1.10.14.
In the presence of side effects, the new production $\mathcal{E} ::= \ldots \mid \textit{open}_D\ v\ \mathcal{E}$ is problematic. The standard workaround is to restrict the second argument of $\textit{open}_D$ to be a value.

1.11 Rows

In Section 1.9, we have shown how to extend ML-the-programming-language with algebraic data types, that is, variant and record type definitions, which we now refer to as simple. This mechanism has a severe limitation: two distinct definitions must define incompatible types. As a result, one cannot hope to write code that uniformly operates over variants or records of different shapes, because the type of such code is not even expressible.
For instance, it is impossible to express the type of the polymorphic record access operation, which retrieves the value stored at a particular field $\ell$ inside a record, regardless of which other fields are present. Indeed, if the label $\ell$ appears with type $T$ in the definition of the simple record type $D\,\vec{X}$, then the associated record access operation has type $\forall\bar{X}.D\,\vec{X} \to T$. If $\ell$ appears with type $T'$ in the definition of another simple record type, say $D'\,\vec{X}'$, then the associated record access operation has type $\forall\bar{X}'.D'\,\vec{X}' \to T'$; and so on. The most precise type scheme that subsumes all of these incomparable type schemes is $\forall XY.X \to Y$. It is, however, not a sound type scheme for the record access operation. Another powerful operation whose type is currently not expressible is polymorphic record extension, which copies a record and stores a value at field $\ell$ in the copy, possibly creating the field if it did not previously exist, again regardless of which other fields are present. (If $\ell$ was known to previously exist, the operation is known as polymorphic record update.)
In order to assign types to polymorphic record operations, we must do away with record type definitions: we must replace named record types, such as $D\,\vec{X}$, with structural record types that provide a direct description of the record's domain and contents. (Following the analogy between a record and a partial function from labels to values, we use the word domain to refer to the set of fields that are defined in a record.) For instance, a product type is structural: the type $T_1 \times T_2$ is the (undeclared) type of pairs whose first component has type $T_1$ and whose second component has type $T_2$. Thus, we wish to design record types that behave very much like product types. In doing so, we face two orthogonal difficulties. First, as opposed to pairs, records may have different domains. Because the type system must statically ensure that no undefined field is accessed, information about a record's domain must be made part of its type. Second, because we suppress record type definitions, labels must now be predefined. However, for efficiency and modularity reasons, it is impossible to explicitly list every label in existence in every record type.
In what follows, we explain how to address the first difficulty in the simple setting of a finite set of labels. Then, we introduce rows, which allow dealing with an infinite set of labels, and address the second difficulty. We define the syntax and logical interpretation of rows, study the new constraint equivalence laws that arise in their presence, and extend the first-order unification algorithm with support for rows. Then, we review several applications of rows, including polymorphic operations on records, variants, and objects, and discuss alternatives to rows.

Records with finite carrier

Let us temporarily assume that $\mathcal{L}$ is finite. In fact, for the sake of definiteness, let us assume that $\mathcal{L}$ is $\{\ell_a, \ell_b, \ell_c\}$.
To begin, let us consider only full records, whose domain is exactly $\mathcal{L}$ - in other words, tuples indexed by $\mathcal{L}$. To describe them, it is natural to introduce a type constructor record of signature $\star \otimes \star \otimes \star \Rightarrow \star$. The type record $T_a\,T_b\,T_c$ represents all records where the field $\ell_a$ (resp. $\ell_b$, $\ell_c$) contains a value of type $T_a$ (resp. $T_b$, $T_c$). Please note that record is nothing but a product type constructor of arity 3. The basic operations on records, namely creation of a record out of a default value, which is stored into every field, update of a particular field (say, $\ell_b$), and access to a particular field (say, $\ell_b$), may be assigned the following type schemes:
$$
\begin{aligned}
\{\cdot\} : \;& \forall \mathrm{X}.\, \mathrm{X} \rightarrow \text{record } \mathrm{X}\, \mathrm{X}\, \mathrm{X} \\
\{\cdot \text{ with } \ell_b = \cdot\} : \;& \forall \mathrm{X}_a \mathrm{X}_b \mathrm{X}_b' \mathrm{X}_c.\, \text{record } \mathrm{X}_a\, \mathrm{X}_b\, \mathrm{X}_c \rightarrow \mathrm{X}_b' \rightarrow \text{record } \mathrm{X}_a\, \mathrm{X}_b'\, \mathrm{X}_c \\
\cdot\{\ell_b\} : \;& \forall \mathrm{X}_a \mathrm{X}_b \mathrm{X}_c.\, \text{record } \mathrm{X}_a\, \mathrm{X}_b\, \mathrm{X}_c \rightarrow \mathrm{X}_b
\end{aligned}
$$
Here, polymorphism allows updating or accessing a field without knowledge of the types of the other fields. This flexibility is made possible by the property that all record types are formed using a single record type constructor.
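To make the three schemes concrete, here is a minimal OCaml sketch (the names `record`, `create`, `update_b`, and `access_b` are our own): a full record over the three labels is a value of a single three-parameter type constructor, and the types OCaml infers for the operations mirror the schemes above, polymorphic in the untouched fields.

```ocaml
(* A full record over {la, lb, lc}: a single type constructor of arity 3,
   mirroring "record Ta Tb Tc" in the text. *)
type ('a, 'b, 'c) record = { la : 'a; lb : 'b; lc : 'c }

(* {.} : creation from a default value stored into every field.
   Inferred type: 'x -> ('x, 'x, 'x) record *)
let create x = { la = x; lb = x; lc = x }

(* {. with lb = .} : update of field lb.  The explicit construction lets
   the type of lb change; inferred type:
   ('a, 'b, 'c) record -> 'd -> ('a, 'd, 'c) record *)
let update_b r v = { la = r.la; lb = v; lc = r.lc }

(* .{lb} : access to field lb; ('a, 'b, 'c) record -> 'b *)
let access_b r = r.lb
```

Note that `update_b` may change the type of field $\ell_b$ precisely because every record type is an instance of the single constructor `record`.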
This is fine, but in general, the domain of a record is not necessarily $\mathcal{L}$: it may be a subset of $\mathcal{L}$. How may we deal with this fact, while maintaining the above key property? A naive approach consists in encoding arbitrary records in terms of full records, using the standard algebraic data type option, whose definition is option $\mathrm{X} \approx$ pre $\mathrm{X}$ + abs. We use pre for present and abs for absent: indeed, a field that is defined with value $\mathrm{v}$ is encoded as a field with value pre $\mathrm{v}$, while an undefined field is encoded as a field with value abs. Thus, an arbitrary record whose fields, if present, have types $\mathrm{T}_a$, $\mathrm{T}_b$, and $\mathrm{T}_c$, respectively, may be encoded as a full record of type record (option $\mathrm{T}_a$) (option $\mathrm{T}_b$) (option $\mathrm{T}_c$). This naive approach suffers from a serious drawback: record types still contain no domain information. As a result, field access must involve a dynamic check, so as to determine whether the desired field is present: in our encoding, this corresponds to the use of case$_{\text{option}}$.
To avoid this overhead and increase programming safety, we must move this check from runtime to compile time. In other words, we must make the type system aware of the difference between pre and abs. To do so, we replace the definition of option by two separate algebraic data type definitions, namely pre $\mathrm{X} \approx$ pre $\mathrm{X}$ and abs $\approx$ abs. In other words, we introduce a unary type constructor pre, whose only associated data constructor is pre, and a nullary type constructor abs, whose only associated data constructor is abs. Record types now contain domain information: for instance, a record of type record abs (pre $\mathrm{T}_b$) (pre $\mathrm{T}_c$) must have domain $\{\ell_b, \ell_c\}$. Thus, the type of a field tells whether it is defined. Since the type constructor pre has no data constructors other than pre, the accessor pre$^{-1}$, whose type is $\forall \mathrm{X}.\,$pre $\mathrm{X} \rightarrow \mathrm{X}$, and which allows retrieving the value stored in a field, cannot fail. Thus, the dynamic check has been eliminated.
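The two replacement definitions can be transcribed directly. In the OCaml sketch below (declarations are our own), `pre` is a unary type with single data constructor `Pre`, `abs` a nullary type with single data constructor `Abs`, and the accessor pre$^{-1}$ is total because its pattern match is exhaustive.

```ocaml
(* pre X ≈ pre X : a present field, holding a value of type X *)
type 'a pre = Pre of 'a

(* abs ≈ abs : an absent field *)
type abs = Abs

(* pre^-1 : forall X. pre X -> X.  The match cannot fail, since Pre is
   the only constructor of 'a pre: no dynamic check remains. *)
let pre_inv (Pre v) = v
```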
To complete the definition of our encoding, we now define operations on arbitrary records in terms of operations on full records. To distinguish between the two, we write the former with angle brackets, instead of curly braces. The empty record $\langle\rangle$, where all fields are undefined, may be defined as $\{$abs$\}$. Extension at a particular field (say, $\ell_b$), written $\langle\cdot$ with $\ell_b = \cdot\rangle$, is defined as $\lambda r.\lambda z.\{r$ with $\ell_b = $ pre $z\}$. Access at a particular field (say, $\ell_b$), written $\cdot\langle\ell_b\rangle$, is defined as $\lambda z.\,$pre$^{-1}\, z.\{\ell_b\}$. It is straightforward to check that these operations have the following principal type schemes:
$$
\begin{aligned}
\langle\rangle : \;& \text{record abs abs abs} \\
\langle\cdot \text{ with } \ell_b = \cdot\rangle : \;& \forall \mathrm{X}_a \mathrm{X}_b \mathrm{X}_b' \mathrm{X}_c.\, \text{record } \mathrm{X}_a\, \mathrm{X}_b\, \mathrm{X}_c \rightarrow \mathrm{X}_b' \rightarrow \text{record } \mathrm{X}_a\, (\text{pre } \mathrm{X}_b')\, \mathrm{X}_c \\
\cdot\langle\ell_b\rangle : \;& \forall \mathrm{X}_a \mathrm{X}_b \mathrm{X}_c.\, \text{record } \mathrm{X}_a\, (\text{pre } \mathrm{X}_b)\, \mathrm{X}_c \rightarrow \mathrm{X}_b
\end{aligned}
$$
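Continuing the sketch (self-contained; names are our own), the three operations on arbitrary records are definable in terms of the full-record representation, and OCaml infers types with exactly the shape of the principal type schemes above.

```ocaml
type 'a pre = Pre of 'a
type abs = Abs
type ('a, 'b, 'c) record = { la : 'a; lb : 'b; lc : 'c }

(* <> = {abs} : the empty record, every field undefined.
   Type: (abs, abs, abs) record *)
let empty = { la = Abs; lb = Abs; lc = Abs }

(* <. with lb = .> = \r.\z.{r with lb = pre z}.
   Type: ('a, 'b, 'c) record -> 'd -> ('a, 'd pre, 'c) record,
   polymorphic in 'b: the argument may or may not define lb. *)
let extend_b r z = { la = r.la; lb = Pre z; lc = r.lc }

(* .<lb> = \z. pre^-1 z.{lb}.
   Type: ('a, 'b pre, 'c) record -> 'b: the argument must define lb. *)
let access_b r = match r.lb with Pre v -> v
```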
It is important to notice that the type schemes associated with extension and access at $\ell_b$ are polymorphic in $\mathrm{X}_a$ and $\mathrm{X}_c$, which now means that these operations are insensitive not only to the type, but also to the presence or absence of the fields $\ell_a$ and $\ell_c$. Furthermore, extension is polymorphic in $\mathrm{X}_b$, which means that it is insensitive to the presence or absence of the field $\ell_b$ in its argument. The subterm pre $\mathrm{X}_b'$ in its result type reflects the fact that $\ell_b$ is defined in the extended record. Conversely, the subterm pre $\mathrm{X}_b$ in the type of the access operation reflects the requirement that $\ell_b$ be defined in its argument.
Our encoding of arbitrary records in terms of full records was carried out for pedagogical purposes. In practice, no such encoding is necessary: the data constructors pre and abs have no machine representation, and the compiler is free to lay out records in memory in an efficient manner. The encoding is interesting, however, because it provides a natural way of introducing the type constructors pre and abs, which play an important role in our treatment of polymorphic record operations.
We remark that, in our encoding, the arguments of the type constructor record are expected to be either type variables or formed with pre or abs, while, conversely, the type constructors pre and abs are not intended to appear anywhere else. It is possible to enforce this invariant using kinds. In addition to $\star$, let us introduce the kind $\diamond$ of field types. Then, let us adopt the following signatures: pre : $\star \Rightarrow \diamond$, abs : $\diamond$, and record : $\diamond \otimes \diamond \otimes \diamond \Rightarrow \star$.
1.11.1 Exercise [$\star$, Recommended, $\rightarrow$]: Check that the three type schemes given above are well-kinded. What is the kind of each type variable?
1.11.2 Exercise [$\star\star$, Recommended, $\nrightarrow$]: Our record types contain information about every field, regardless of whether it is defined: we encode definedness information within the type of each field, using the type constructors pre and abs. A perhaps more natural approach would be to introduce a family of record type constructors, indexed by the subsets of $\mathcal{L}$, so that the types of records with different domains are formed with different constructors. For instance, the empty record would have type $\{\}$; a record that defines the field $\ell_a$ only would have a type of the form $\{\ell_a : \mathrm{T}_a\}$; a record that defines the fields $\ell_b$ and $\ell_c$ only would have a type of the form $\{\ell_b : \mathrm{T}_b;\, \ell_c : \mathrm{T}_c\}$; and so on. Assuming that the type discipline is Damas and Milner's (that is, assuming an equality-only syntactic model), would it be possible to assign satisfactory type schemes to polymorphic record access and extension? Would it help to equip record types with a nontrivial subtyping relation?

Records with infinite carrier

Finite records are insufficient from both practical and theoretical points of view. In practice, the set of labels could become very large, making the type of every record as large as the set of labels itself, even if only a few labels are actually defined. In principle, the set of labels could even be infinite. Actually, in modular programs, the whole set of labels may not be known in advance, which amounts in some way to working with an infinite set of labels. Thus, records must be drawn from an infinite set of labels, whether their domains are finite or infinite. Still, we can restrict our attention to records that are almost constant, that is, records where only a finite number of fields differ. With this restriction, full records (defined everywhere) can always be built by giving explicit definitions for a finite number of fields and a default value for all other fields, as in the finite case. For instance, the record $\{\{\{$false$\}$ with $\ell = 1\}$ with $\ell' =$ true$\}$ is the record equal to true on field $\ell'$, to 1 on field $\ell$, and to false on any other field.
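An almost-constant record can be represented as a default value plus finitely many overrides. The OCaml sketch below (our own representation; for simplicity all fields share a single value type, so it models the record values only, not the typing discipline being developed) shows how a record over an infinite label set remains finitely describable.

```ocaml
(* An almost-constant record: a default value plus a finite list of
   overridden fields.  Every label not listed maps to the default. *)
type ('l, 'v) record = { default : 'v; overrides : ('l * 'v) list }

(* {d} : the constant record, equal to d everywhere *)
let const d = { default = d; overrides = [] }

(* {r with l = v} : override a single field *)
let update r l v =
  { r with overrides = (l, v) :: List.remove_assoc l r.overrides }

(* r.{l} : defined on every label, despite the finite representation *)
let get r l =
  match List.assoc_opt l r.overrides with
  | Some v -> v
  | None -> r.default
```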
Types of records are functions from labels to types, called rows. However, for the sake of generality, we use a unary type constructor, say $\Pi$, as an indirection between rows and record types. Moreover, we further restrict our attention to the case where rows are also almost constant. (The fact that the property holds for record values does not imply that it also holds for record types, for the default value of some record could have a polymorphic type, and one could wish to see each field with a different instance of this polymorphic type. So this is a true restriction, but a reasonable one.) Thus, rows can also be represented by giving explicit types for a finite number of fields and a default type for all other fields. We write $\partial \mathrm{T}$ for the row whose type is $\mathrm{T}$ on every field, and $(\ell : \mathrm{T};\, \mathrm{T}')$ for the row whose type is $\mathrm{T}$ on field $\ell$ and $\mathrm{T}'$ on all other fields. Formally, $\partial$ is a unary type constructor and $\ell$ is a family of binary type constructors, written with the syntactic sugar $(\ell : \cdot\,;\, \cdot)$. For example, $\Pi(\ell : \text{bool};\, (\ell' : \text{int};\, \partial\,\text{bool}))$ is a record type that describes records whose field $\ell$ carries a value of type bool, whose field $\ell'$ carries a value of type int, and whose other fields carry values of type bool. In fact, this is a sound type for the record defined above. The type $\Pi(\ell' : \text{int};\, (\ell : \text{bool};\, \partial\,\text{bool}))$ should also be a sound type for this record, since the order in which fields are specified should not matter. We actually treat both types as equivalent.
Furthermore, the row $(\ell : \text{bool};\, \partial\,\text{bool})$, which stands for bool on field $\ell$ and $\partial\,$bool everywhere else, must also be equivalent to $\partial\,$bool, which stands for bool everywhere.
A record type may also contain type variables. For instance, the function $\lambda z.\{z\}$ that maps any value $\mathrm{v}$ to a record with default value $\mathrm{v}$ has type $\mathrm{X} \rightarrow \Pi(\partial \mathrm{X})$. Projecting this record on any field returns a value of the same type $\mathrm{X}$. By comparison, the function that reads some field $\ell$ of its (record) argument has type $\Pi(\ell : \mathrm{X};\, \mathrm{Y}) \rightarrow \mathrm{X}$: this says that the argument must be a record where field $\ell$ has type $\mathrm{X}$ and other fields may have any type. The variable $\mathrm{Y}$ is called a row variable, since it can be instantiated to any row. For instance, $\mathrm{Y}$ can be instantiated to $(\ell' : \text{int};\, \mathrm{Y}')$, and as a result this function can be applied to the record above. Conversely, the row $\partial \mathrm{X}$, which is equal to $(\ell' : \mathrm{X};\, \partial \mathrm{X})$, can only be instantiated to rows of the form $\partial \mathrm{T}$, which are equal to $(\ell' : \mathrm{T};\, \partial \mathrm{T})$, that is, to constant rows.
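OCaml's object types offer a convenient way to see a row variable in action: in the inferred type `< l : 'a; .. > -> 'a`, the ellipsis `..` plays the role of the row variable $\mathrm{Y}$ in $\Pi(\ell : \mathrm{X};\, \mathrm{Y}) \rightarrow \mathrm{X}$. A small sketch (names our own):

```ocaml
(* Reading field l of the argument.  The inferred type is
   < l : 'a; .. > -> 'a, where ".." is a row variable standing for
   the (arbitrary) remaining methods. *)
let read_l o = o#l

(* Two objects with different rows; read_l applies to both, because
   the row variable is instantiated differently at each use. *)
let r1 = object method l = true method l' = 42 end
let r2 = object method l = 13 end
```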

Syntax of row types

Let $\mathcal{L}$ be a denumerable collection of labels. We write $\ell.L$ for $\{\ell\} \uplus L$, which implies $\ell \notin L$. We first introduce kinds, so as to distinguish rows, such as $(\ell : \text{int};\, \partial\,\text{bool})$, from basic types, such as int or int $\rightarrow$ int.
1.11.3 Definition [Row kinds]: Let row kinds be composed of a particular kind Type and the collection of kinds $\text{Row}(L)$, where $L$ ranges over finite subsets of $\mathcal{L}$. We use the letter $s$ to range over row kinds.
Intuitively, a row of kind $\text{Row}(L)$ is a function from $\mathcal{L} \setminus L$ to types. That is, $L$ specifies the set of labels that the row must not define. For instance, the (basic) type $\Pi(\ell : \text{int};\, \mathrm{X})$ has kind Type, while the row $(\ell : \text{int};\, \mathrm{X})$ has kind $\text{Row}(\emptyset)$, provided $\mathrm{X}$ has kind $\text{Row}(\{\ell\})$.
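This kinding discipline is straightforward to implement as a checker. A minimal OCaml sketch (datatypes and names our own), where `KRow ls` plays the role of $\text{Row}(L)$ and the list `ls` holds the labels the row must not define:

```ocaml
type kind = KType | KRow of string list   (* Type and Row(L) *)

type ty =
  | Var of string * kind        (* a variable carrying its declared kind *)
  | Int                         (* a basic type constructor *)
  | Pi of ty                    (* Π : Row(∅) ⇒ Type *)
  | Row of string * ty * ty     (* (l : T ; R) *)
  | Const of ty                 (* ∂T *)

(* check t k: does t have row kind k?
   (l : T ; R) has kind Row(L) iff l ∉ L, T : Type, and R : Row(l.L). *)
let rec check t k =
  match t, k with
  | Var (_, k'), _ -> k' = k
  | Int, KType -> true
  | Pi r, KType -> check r (KRow [])
  | Const t', KRow _ -> check t' KType
  | Row (l, t', r), KRow ls ->
      not (List.mem l ls) && check t' KType && check r (KRow (l :: ls))
  | _ -> false
```

In particular, the checker rejects a row that defines the same label twice, just as the kinds $\text{Row}(\ell.L)$ do.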
To remain abstract, the definition of rows is parameterized by a signature $\mathcal{S}_0$ for building basic types and a signature $\mathcal{S}_1$ for building rows. From these two signatures, we then define a new signature $\mathcal{S}$ that completely specifies the set of types. However, the signature $\mathcal{S}$ must superimpose row kinds on top of the (basic) kinds of the two input signatures $\mathcal{S}_0$ and $\mathcal{S}_1$. We use product signatures to make this process explicit. More precisely, we build a product signature from two signatures $K \Rightarrow \kappa$ and $K' \Rightarrow \kappa'$ with the following notations: we write $\kappa.\kappa'$ for the pair $(\kappa, \kappa')$; $K.\kappa'$ for the mapping $(d \mapsto K(d).\kappa')^{d \in \text{dom}(K)}$; $(K \Rightarrow \kappa).\kappa'$ for the kind signature $K.\kappa' \Rightarrow \kappa.\kappa'$; and, symmetrically, we write $\kappa.K'$ and $\kappa.(K' \Rightarrow \kappa')$. The signature $\mathcal{S}$ reuses the same input type constructors as $\mathcal{S}_0$ and $\mathcal{S}_1$, but at different row kinds. We use superscripts to provide copies of type constructors at different kinds, and thus avoid overloading of kinds.
1.11.4 Definition [Row extension of a signature]: Let $\mathcal{S}_0$ and $\mathcal{S}_1$ be signatures where all symbols of $\mathcal{S}_1$ are unary. The row extension of $\mathcal{S}_0$ with $\mathcal{S}_1$ is the signature $\mathcal{S}$ defined as follows, where $\kappa$ ranges over basic kinds (those used in $\mathcal{S}_0$ and $\mathcal{S}_1$) and $s$ ranges over row kinds:
| $F \in \text{dom}(\mathcal{S})$ | Signature | Conditions |
| :---: | :--- | :--- |
| $G^{s}$ | $(K \Rightarrow \kappa).s$ | $(G : K \Rightarrow \kappa) \in \mathcal{S}_0$ |
| $H$ | $K.\text{Row}(\emptyset) \Rightarrow \kappa.\text{Type}$ | $(H : K \Rightarrow \kappa) \in \mathcal{S}_1$ |
| $\partial^{\kappa,L}$ | $\kappa.(\text{Type} \Rightarrow \text{Row}(L))$ | |
| $\ell^{\kappa,L}$ | $\kappa.(\text{Type} \otimes \text{Row}(\ell.L) \Rightarrow \text{Row}(L))$ | $\ell \notin L$ |
We usually write $(\ell^{\kappa,L} : \mathrm{t}_1;\, \mathrm{t}_2)$ instead of $\ell^{\kappa,L}\, \mathrm{t}_1\, \mathrm{t}_2$ and let this symbol be right-associative. We often drop the superscripts of type constructors since, for any type expression $\mathrm{T}$, superscripts can be unambiguously recovered from the kind of $\mathrm{T}$.
1.11.5 Example: Let us assume there is a single basic kind $\star$ and that $\mathcal{S}_1$ contains a unique type constructor $\Pi$ (hence of kind $\star \Rightarrow \star$). An example of a row type is $\mathrm{X}_0 \rightarrow \Pi(\ell_1 : G;\, (\mathrm{Y} \rightarrow \partial \mathrm{X}_0))$. With all superscript annotations, this type is
$$
\mathrm{X}_0 \rightarrow^{\star,\text{Type}} \Pi\left(\ell_1^{\star,\text{Row}(\emptyset)} : G^{\text{Type}};\, \left(\mathrm{Y} \rightarrow^{\star,\text{Row}(\{\ell_1\})} \partial^{\star,\text{Row}(\{\ell_1\})}\, \mathrm{X}_0\right)\right).
$$
Intuitively, this is the type of a function that takes a value of type $\mathrm{X}_0$ and returns a record where field $\ell_1$ has type $G$ and all other fields are functions that, given a value of an arbitrary type, return a value of (the same) type $\mathrm{X}_0$. An instance of this type is $\mathrm{X}_0 \rightarrow \Pi(\ell_1 : G;\, ((\ell_2 : \mathrm{Y}_2;\, \mathrm{Y}') \rightarrow (\ell_2 : \mathrm{X}_0;\, \partial \mathrm{X}_0)))$, obtained by instantiating the row variable $\mathrm{Y}$ and by expanding the constant row $\partial \mathrm{X}_0$. As shown below, this type is actually equivalent to $\mathrm{X}_0 \rightarrow \Pi(\ell_1 : G;\, \ell_2 : \mathrm{Y}_2 \rightarrow \mathrm{X}_0;\, (\mathrm{Y}' \rightarrow \partial \mathrm{X}_0))$, by distributivity of the type constructor $\rightarrow$ over the type constructor $\ell_2$. Please note again the difference between $\mathrm{Y}$, which is a row variable that can expand to different type variables on different labels, and $\partial \mathrm{X}$, which is a constant row that expands to the same type variable $\mathrm{X}$ on all labels.
1.11.6 Example [Ill-kinded expressions]: Under the assumptions of the previous example, the expression $\mathrm{X} \rightarrow \Pi(\mathrm{X})$ is not a row type, since the variable $\mathrm{X}$ cannot simultaneously be of row kind Type and $\text{Row}(\emptyset)$, as required by its two occurrences, from left to right respectively. The expression $(\ell : \mathrm{X};\, \ell : \mathrm{X}';\, \mathrm{X}'')$ is also ill-kinded, for the inner expression $(\ell : \mathrm{X}';\, \mathrm{X}'')$, of row kind $\text{Row}(L)$ with $\ell \notin L$, cannot also have row kind $\text{Row}(\{\ell\})$, as required by its occurrence in the whole expression. Indeed, row kinds prohibit multiple definitions of the same label, as well as using rows in place of basic types and conversely. Notice that $\Pi(\Pi(\mathrm{X}))$ is also ill-formed, since type constructors of $\mathcal{S}_1$ are not lifted to row kinds and thus cannot appear in rows, except under the type constructor $\partial$, hence as basic types.
1.11.7 Exercise [$\star\star\star$, $\nrightarrow$]: Design an algorithm that infers the superscripts of the type constructors of a type expression from its kind. Can the kind of the expression be inferred as well? Can you give an algorithm to check that type expressions are well-kinded when both the superscripts of type constructors and the kinds of the whole type expressions are omitted?

Meaning of rows

As mentioned above, a row of kind $\text{Row}(L)$ stands for a function from $\mathcal{L} \setminus L$ to types. Actually, it is simpler to represent this function explicitly as an infinitely branching tree in the model. For this purpose, we use a collection of constructors $L$ of (infinite but denumerable) arity $\mathcal{L} \setminus L$.
1.11.8 Definition [Row model]: Let $\mathcal{S}$ be the row extension of a signature $\mathcal{S}_0$ with a signature $\mathcal{S}_1$. Let $\mathcal{S}_\mathcal{M}$ be the following signature, where $\kappa$ ranges over basic kinds and $L$ ranges over finite subsets of $\mathcal{L}$:
| $F \in \text{dom}(\mathcal{S}_\mathcal{M})$ | Signature | Conditions |
| :---: | :--- | :--- |
| $G$ | $(K \Rightarrow \kappa).\text{Type}$ | $(G : K \Rightarrow \kappa) \in \mathcal{S}_0$ |
| $H$ | $K.\text{Row}(\emptyset) \Rightarrow \kappa.\text{Type}$ | $(H : K \Rightarrow \kappa) \in \mathcal{S}_1$ |
| $L^{\kappa}$ | $\kappa.(\text{Type}^{\mathcal{L} \setminus L} \Rightarrow \text{Row}(L))$ | |
Let $\mathcal{M}_\kappa$ consist of the regular trees $t$ built over the signature $\mathcal{S}_\mathcal{M}$ such that $t(\epsilon)$ has image kind $\kappa$. We interpret a type constructor $F$ of signature $K \Rightarrow \kappa.s$ as a function that maps $T \in \mathcal{M}_K$ to the tree $t \in \mathcal{M}_{\kappa.s}$ defined by cases on $F$ and on the basic kind $\kappa$:
| $F \in \mathcal{S}$ | $t(\epsilon)$ | For $d \in \text{dom}(K)$ and $\ell \in \mathcal{L} \setminus L$, $\ell \neq \ell_0$: |
| :---: | :---: | :--- |
| $G^{\text{Type}}$ | $G$ | $t/d = T(d)$ |
| $H$ | $H$ | $t/1 = T(1)$ |
| $G^{\text{Row}(L)}$ | $L^{\kappa}$ | $t(\ell) = G \wedge t/(\ell \cdot d) = T(d)/\ell$ |
| $\partial^{\kappa,L}$ | $L^{\kappa}$ | $t/\ell = T(1)$ |
| $\ell_0^{\kappa,L}$ | $L^{\kappa}$ | $t/\ell_0 = T(1) \wedge t/\ell = T(2)/\ell$ |
In the presence of subtyping, we let the type constructors $G$ and $H$ behave in $\mathcal{S}_\mathcal{M}$ as in $\mathcal{S}_0$ and $\mathcal{S}_1$, and let the type constructors $L^{\kappa}$ be isolated and covariant in every position. Models that define ground types and interpret type constructors in this manner are referred to as row models.

Reasoning with row types

In this section, we assume a subtyping model. All reasoning principles also apply to the equality-only model, which is a subcase of the subtyping model.
The meaning of rows has been carefully defined so as to be independent of certain syntactic choices. In particular, the order in which the types of significant fields are declared leaves the meaning of rows unchanged. This is formally stated by the following lemma.
1.11.9 Lemma: The equations of Figure 1-17 are equivalent to true.
Proof: Each equation can be considered independently. It suffices to see that any ground assignment $\phi$ sends both sides of the equation to the same element
$$
\begin{aligned}
(\ell_1 : \mathrm{T}_1;\, \ell_2 : \mathrm{T}_2;\, \mathrm{T}_3) &= (\ell_2 : \mathrm{T}_2;\, \ell_1 : \mathrm{T}_1;\, \mathrm{T}_3) && \text{(C-Row-LL)} \\
\partial \mathrm{T} &= (\ell : \mathrm{T};\, \partial \mathrm{T}) && \text{(C-Row-DL)} \\
\partial (G\, \mathrm{T}_1 \ldots \mathrm{T}_n) &= G\, (\partial \mathrm{T}_1) \ldots (\partial \mathrm{T}_n) && \text{(C-Row-DG)} \\
G\, (\ell : \mathrm{T}_1;\, \mathrm{T}_1') \ldots (\ell : \mathrm{T}_n;\, \mathrm{T}_n') &= (\ell : G\, \mathrm{T}_1 \ldots \mathrm{T}_n;\, G\, \mathrm{T}_1' \ldots \mathrm{T}_n') && \text{(C-Row-GL)}
\end{aligned}
$$

Figure 1-17: Equational reasoning for row types.

$$
\begin{aligned}
&(\ell_1 : \mathrm{T}_1;\, \mathrm{T}_1') = (\ell_2 : \mathrm{T}_2;\, \mathrm{T}_2') \equiv \exists \mathrm{X}.\, \left(\mathrm{T}_1' = (\ell_2 : \mathrm{T}_2;\, \mathrm{X}) \wedge \mathrm{T}_2' = (\ell_1 : \mathrm{T}_1;\, \mathrm{X})\right) \\
&\qquad \text{if } \mathrm{X} \mathbin{\#} \mathrm{ftv}(\mathrm{T}_1, \mathrm{T}_1', \mathrm{T}_2, \mathrm{T}_2') \wedge \ell_1 \neq \ell_2 && \text{(C-Mute-LL)} \\[1ex]
&(\ell : \mathrm{T};\, \mathrm{T}') = G\, \mathrm{T}_i^{\,I} \equiv \exists (\mathrm{X}_i, \mathrm{X}_i')^{I}.\, \left(\mathrm{T} = G\, \mathrm{X}_i^{\,I} \wedge \mathrm{T}' = G\, \mathrm{X}_i'^{\,I} \wedge \left(\mathrm{T}_i = (\ell : \mathrm{X}_i;\, \mathrm{X}_i')\right)^{I}\right) \\
&\qquad \text{if } (\mathrm{X}_i, \mathrm{X}_i')^{I} \mathbin{\#} \mathrm{ftv}(\mathrm{T}, \mathrm{T}', \mathrm{T}_i^{\,I}) && \text{(C-Mute-LG)} \\[1ex]
&\partial \mathrm{T} = G\, \mathrm{T}_i^{\,I} \equiv \exists \mathrm{X}_i^{\,I}.\, \left(\mathrm{T} = G\, \mathrm{X}_i^{\,I} \wedge \left(\mathrm{T}_i = \partial \mathrm{X}_i\right)^{I}\right) \\
&\qquad \text{if } \mathrm{X}_i^{\,I} \mathbin{\#} \mathrm{ftv}(\mathrm{T}, \mathrm{T}_i^{\,I}) && \text{(C-Mute-DG)} \\[1ex]
&\partial \mathrm{T} = (\ell : \mathrm{T}';\, \mathrm{T}'') \equiv \mathrm{T} = \mathrm{T}' \wedge \partial \mathrm{T} = \mathrm{T}'' && \text{(C-Mute-DL)}
\end{aligned}
$$

Figure 1-18: Constraint equivalence laws for rows.

in the model, which follows directly from the meaning of row types. Notice that this fact only depends on the semantics of types and not on the semantics of the subtyping predicate.
It follows from those equations that the type constructors $\ell$, $\partial$, and $G$ are never isolated, since each equation exhibits a pair of compatible top symbols. Variances and incompatible pairs of type constructors depend on the signature $\mathcal{S}_{0} \uplus \mathcal{S}_{1}$. However, it is not difficult to see that the type constructors $\partial$ and $\ell$ are logically covariant in all directions and that the logical variance of the type constructors $G$ of $\operatorname{dom}(\mathcal{S}_{0} \uplus \mathcal{S}_{1})$ corresponds to their syntactic variance, which, in most cases, will allow the decomposition of equations with the same top symbols. Moreover, an equation between two terms whose top symbols form one of the four compatible pairs derived from the equations of Figure 1-17 holds only if their immediate subexpressions can be "conciliated" in some way. There is a transformation quite similar to decomposition, called mutation, that mimics the equations for rows (Figure 1-17) and is described by the rules of Figure 1-18. For the sake of readability and conciseness, we write $\mathrm{T}_{i}^{I}$ instead of $\mathrm{T}_{i}^{i \in I}$.
1.11.10 Lemma [Mutation]: All equivalence laws in Figure 1-18 hold.
Proof:
  • Case C-Mute-LL: Assume $\mathrm{X} \mathrel{\#} \operatorname{ftv}(\mathrm{T}_{1}, \mathrm{T}_{1}', \mathrm{T}_{2}, \mathrm{T}_{2}')$ (1) and $\ell_{1} \neq \ell_{2}$. Let $\operatorname{Row}(L)$ be the row kind of this equation. Let $\phi$ be a ground assignment that validates the constraint $(\ell_{1}:\mathrm{T}_{1};\mathrm{T}_{1}') = (\ell_{2}:\mathrm{T}_{2};\mathrm{T}_{2}')$. That is, $\phi$ sends all terms of the multi-equation to the same ground type $t$ of row kind $\operatorname{Row}(L)$. Moreover, the semantics of row terms implies that $t$ satisfies $t(\epsilon) = L$, $t/\ell_{1} = \phi(\mathrm{T}_{1}) = \phi(\mathrm{T}_{2}')/\ell_{1}$, $t/\ell_{2} = \phi(\mathrm{T}_{1}')/\ell_{2} = \phi(\mathrm{T}_{2})$, and $t/\ell = \phi(\mathrm{T}_{2}')/\ell = \phi(\mathrm{T}_{1}')/\ell$ for all $\ell \in \mathcal{L} \setminus \ell_{1}.\ell_{2}.L$ (2). Let $t'$ be the tree defined by $t'(\epsilon) = \ell_{1}.\ell_{2}.L$ and $t'/\ell = t/\ell$ for all $\ell \in \mathcal{L} \setminus \ell_{1}.\ell_{2}.L$. By construction and (2), $\phi[\mathrm{X} \mapsto t']$ satisfies both equations $\mathrm{T}_{1}' = (\ell_{2}:\mathrm{T}_{2};\mathrm{X})$ and $\mathrm{T}_{2}' = (\ell_{1}:\mathrm{T}_{1};\mathrm{X})$. Thus, by CM-Exists and (1), $\phi$ satisfies $\exists \mathrm{X}.\,(\mathrm{T}_{1}' = (\ell_{2}:\mathrm{T}_{2};\mathrm{X}) \wedge \mathrm{T}_{2}' = (\ell_{1}:\mathrm{T}_{1};\mathrm{X}))$. Conversely, we have the entailment:
$$
\begin{aligned}
&\exists \mathrm{X}.\,\big(\mathrm{T}_{1}' = (\ell_{2}:\mathrm{T}_{2};\mathrm{X}) \wedge \mathrm{T}_{2}' = (\ell_{1}:\mathrm{T}_{1};\mathrm{X})\big)\\
&\quad\equiv \exists \mathrm{X}.\,\big((\ell_{1}:\mathrm{T}_{1};\mathrm{T}_{1}') = (\ell_{1}:\mathrm{T}_{1};\ell_{2}:\mathrm{T}_{2};\mathrm{X}) \wedge (\ell_{2}:\mathrm{T}_{2};\mathrm{T}_{2}') = (\ell_{2}:\mathrm{T}_{2};\ell_{1}:\mathrm{T}_{1};\mathrm{X})\big) &&(3)\\
&\quad\Vdash \exists \mathrm{X}.\,(\ell_{1}:\mathrm{T}_{1};\mathrm{T}_{1}') = (\ell_{2}:\mathrm{T}_{2};\mathrm{T}_{2}') &&(4)\\
&\quad\equiv (\ell_{1}:\mathrm{T}_{1};\mathrm{T}_{1}') = (\ell_{2}:\mathrm{T}_{2};\mathrm{T}_{2}') &&(5)
\end{aligned}
$$
(3) follows by covariance of $\ell_{1}$ and $\ell_{2}$; (4) by C-Row-LL and transitivity of equivalence; (5) by C-Ex* and (1).
  • Cases C-Mute-LG, C-Mute-DG, and C-Mute-DL: The reasoning is similar to the case C-Mute-LL.

Solving row constraints in an equality model

In this section, we extend the constraint solver for the equality-only free tree model (Figure 1-11), so as to handle rows. We thus assume an equality-only model.
Mutation is a common technique to solve equations in a large class of nonfree algebras that are described by syntactic theories (Kirchner and Klay, 1990). The equations of Figure 1-17 happen to be a syntactic presentation of an equational theory, from which a unification algorithm could be automatically derived (Rémy, 1993). We recover the same transformation rules directly, without using results on syntactic theories.
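To make the mutation transformation concrete, here is a minimal Python sketch of a single C-Mute-LL step. It is not part of the chapter's development: the tuple representation and all function names are ours. An equation between two row extensions with distinct head labels is replaced by two equations that share a fresh row variable.

```python
import itertools

# Symbolic terms: ("row", label, ty, rest) stands for the row (label: ty; rest),
# and ("var", name) is a type variable; any other term is treated as opaque.
_fresh = itertools.count()

def fresh_var():
    return ("var", f"X{next(_fresh)}")

def mutate_ll(lhs, rhs):
    """One C-Mute-LL step: (l1: T1; T1') = (l2: T2; T2') with l1 != l2
    becomes T1' = (l2: T2; X) and T2' = (l1: T1; X) for a fresh X."""
    _, l1, t1, rest1 = lhs
    _, l2, t2, rest2 = rhs
    assert l1 != l2, "C-Mute-LL applies only to distinct head labels"
    x = fresh_var()
    return [(rest1, ("row", l2, t2, x)),
            (rest2, ("row", l1, t1, x))]

# (a: int; Y1) = (b: bool; Y2) mutates into
# Y1 = (b: bool; X) and Y2 = (a: int; X) for a shared fresh X.
eqs = mutate_ll(("row", "a", "int", ("var", "Y1")),
                ("row", "b", "bool", ("var", "Y2")))
```

The fresh variable plays the role of the common tail of the two rows; the side condition $\mathrm{X} \mathrel{\#} \operatorname{ftv}(\ldots)$ is realized by generating a name that occurs nowhere else.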
The following lemma shows that all pairs of distinct type constructors for which there is no mutation rule are in fact incompatible.
1.11.11 Lemma: All symbols $H \in \mathcal{S}_{1}$ are isolated. Furthermore, for every pair of distinct type constructors $G_{1}, G_{2} \in \operatorname{dom}(\mathcal{S}_{0} \uplus \mathcal{S}_{1})$ and every row kind $s$, we have $G_{1}^{s} \bowtie G_{2}^{s}$.
Proof: Terms of the form H T H T H vec(T)H \overrightarrow{\mathrm{T}}HT are interpreted by ground types with H H HHH at occurrence ϵ ϵ epsilon\epsilonϵ, and conversely the only interpretations of types with H H HHH at occurrence ϵ ϵ epsilon\epsilonϵ are terms of the form H T H T H vec(T)H \overrightarrow{\mathrm{T}}HT. Hence, no ground assignment can ever
$$
\begin{aligned}
&(\ell_{1}:\mathrm{X}_{1};\mathrm{X}_{1}') = (\ell_{2}:\mathrm{T}_{2};\mathrm{T}_{2}') = \epsilon \;\rightarrow\; \exists \mathrm{Y}.\,\big(\mathrm{X}_{1}' = (\ell_{2}:\mathrm{T}_{2};\mathrm{Y}) \wedge \mathrm{T}_{2}' = (\ell_{1}:\mathrm{X}_{1};\mathrm{Y})\big) \wedge (\ell_{1}:\mathrm{X}_{1};\mathrm{X}_{1}') = \epsilon\\
&\qquad \text{if } \mathrm{Y} \mathrel{\#} \operatorname{ftv}(\mathrm{X}_{1},\mathrm{X}_{1}',\mathrm{T}_{2},\mathrm{T}_{2}') \wedge \ell_{1} \neq \ell_{2}\\[4pt]
&(\ell:\mathrm{X};\mathrm{X}') = G\,\mathrm{T}_{i}^{i \in I} = \epsilon \;\rightarrow\; \exists(\mathrm{Y}_{i},\mathrm{Y}_{i}')^{i \in I}.\,\big(\mathrm{X} = G\,\mathrm{Y}_{i}^{i \in I} \wedge \mathrm{X}' = G\,\mathrm{Y}_{i}'^{\,i \in I} \wedge (\mathrm{T}_{i} = (\ell:\mathrm{Y}_{i};\mathrm{Y}_{i}'))^{i \in I}\big) \wedge (\ell:\mathrm{X};\mathrm{X}') = \epsilon\\
&\qquad \text{if } (\mathrm{Y}_{i},\mathrm{Y}_{i}')^{i \in I} \mathrel{\#} \operatorname{ftv}(\mathrm{X},\mathrm{X}',\mathrm{T}_{i}^{i \in I})\\[4pt]
&\partial\mathrm{X} = G\,\mathrm{T}_{i}^{i \in I} = \epsilon \;\rightarrow\; \exists \mathrm{Y}_{i}^{i \in I}.\,\big(\mathrm{X} = G\,\mathrm{Y}_{i}^{i \in I} \wedge (\mathrm{T}_{i} = \partial\mathrm{Y}_{i})^{i \in I}\big) \wedge \partial\mathrm{X} = \epsilon\\
&\qquad \text{if } \mathrm{Y}_{i}^{i \in I} \mathrel{\#} \operatorname{ftv}(\mathrm{X},\mathrm{T}_{i}^{i \in I})\\[4pt]
&\partial\mathrm{X} = (\ell:\mathrm{T};\mathrm{T}') = \epsilon \;\rightarrow\; \mathrm{X} = \mathrm{T} \wedge \partial\mathrm{X} = \mathrm{T}' \wedge \partial\mathrm{X} = \epsilon\\[4pt]
&G\,\overrightarrow{\mathrm{T}} = G'\,\overrightarrow{\mathrm{T}}' = \epsilon \;\rightarrow\; \text{false}\\
&\qquad \text{if } G \bowtie G'
\end{aligned}
$$

Figure 1-19: Unification addendum for row types.

send H T H T H vec(T)H \overrightarrow{\mathrm{T}}HT and F T F T F vec(T^('))F \overrightarrow{\mathrm{T}^{\prime}}FT to the same ground term when F H F H F!=HF \neq HFH and, as a result, H H HHH is isolated.
Let G 1 G 1 G_(1)G_{1}G1 and G 2 G 2 G_(2)G_{2}G2 be two type constructors of S 0 S 0 S_(0)\mathcal{S}_{0}S0. For s = s = s=s=s= Type, the interpretations of terms of the form G 1 s T G 1 s T G_(1)^(s) vec(T)G_{1}^{s} \overrightarrow{\mathrm{T}}G1sT and G 2 s T G 2 s T G_(2)^(s) vec(T)^(')G_{2}^{s} \overrightarrow{\mathrm{T}}^{\prime}G2sT are ground types with symbols G 1 G 1 G_(1)G_{1}G1 and G 2 G 2 G_(2)G_{2}G2 at occurrence ϵ ϵ epsilon\epsilonϵ, respectively. Hence they cannot be made equal under any ground assignment. For s = Row ( L ) s = Row ( L ) s=Row(L)s=\operatorname{Row}(L)s=Row(L), the interpretations of terms of the form G 1 s T G 1 s T G_(1)^(s) vec(T)G_{1}^{s} \overrightarrow{\mathrm{T}}G1sT and G 2 s T G 2 s T G_(2)^(s) vec(T)^(')G_{2}^{s} \overrightarrow{\mathrm{T}}^{\prime}G2sT are ground types with constructor L L LLL at occurrence ϵ ϵ epsilon\epsilonϵ and constructors G 1 G 1 G_(1)G_{1}G1 and G 2 G 2 G_(2)G_{2}G2 at occurrence 1, respectively. Hence they cannot be made equal under any ground assignment.
Any other combination of type constructors forms a compatible pair, as witnessed by the equations of Figure 1-17, and can be transformed by the mutation rules of Figure 1-18.

The constraint solver for row terms is the relation $\rightarrow^{\dagger}$ defined by the rewriting rules of Figure 1-11, except rule S-Clash, which is replaced by the set of rules of Figure 1-19.
1.11.12 Lemma: The rewriting system rarr^(†)\rightarrow^{\dagger} is strongly normalizing.
Note that the termination of $\rightarrow^{\dagger}$ relies on types being well-kinded. In particular, $\rightarrow^{\dagger}$ would not terminate on the ill-kinded system of equations $\mathrm{X} = (\ell:\mathrm{T};\mathrm{X}') \wedge \mathrm{X}' = (\ell':\mathrm{T};\mathrm{X})$.
1.11.13 Lemma: If U U U U Urarr^(†)U^(')U \rightarrow^{\dagger} U^{\prime}UU, then U U U U U-=U^(')U \equiv U^{\prime}UU.
Proof: It suffices to check the property independently for each rule defining $\rightarrow^{\dagger}$. The proofs for the rules of Figure 1-11 other than S-Clash remain valid for row terms. For S-Decompose, the property follows from the invariance of all type constructors, which is preserved for row terms. For the clash rule of Figure 1-19, it follows by Lemma 1.11.11, and for the mutation rules, it follows by Lemma 1.11.10.
Although the relation $\rightarrow$ is not sound for row types, and hence cannot be used to solve constraints over row types, it is still of some interest. In particular, the following property shows that normal forms for row types are identical to normal forms for regular types.
1.11.14 Lemma: A system U U UUU in normal form for rarr^(†)\rightarrow^{\dagger} is also in normal form for rarr\rightarrow.
Proof: The only rule of $\rightarrow$ that is not in $\rightarrow^{\dagger}$ is S-Clash. Thus, it suffices to observe that whenever rule S-Clash is applicable, either the clash rule of Figure 1-19 or a mutation rule is applicable as well.
As a corollary, Lemma 1.8.6 extends to row types.

Operations on records

We now illustrate the use of rows for typechecking operations on records. The simplest application of rows is to full records with an infinite carrier. Record types are expressed with rows instead of a simple product type. The basic operations are the same as in the finite case, that is, creation, polymorphic update, and polymorphic access, but labels are now drawn from an infinite set. However, creation and polymorphic update, which were destructors in the finite case, are now taken as constructors and used to represent records as association lists. The access of a record $\mathrm{v}$ at a field $\ell$ is obtained by linearly searching $\mathrm{v}$ for a definition of field $\ell$ and returning this definition, or returning the default value if no definition has been found for $\ell$.
1.11.15 Example [Full Records]: We assume a unique basic kind $\star$ and a unique covariant isolated type constructor $\Pi$ in $\mathcal{S}_{1}$. Let $\{\cdot\}$ be a unary constructor, $(\{\cdot \text{ with } \ell = \cdot\})^{\ell \in \mathcal{L}}$ a collection of binary constructors, and $(\cdot.\{\ell\})^{\ell \in \mathcal{L}}$ a collection of unary destructors with the following reduction rules:
$$
\begin{aligned}
\{\mathrm{v}\}.\{\ell\} &\xrightarrow{\delta} \mathrm{v} &&\text{(RD-Default)}\\
\{\mathrm{w} \text{ with } \ell = \mathrm{v}\}.\{\ell\} &\xrightarrow{\delta} \mathrm{v} &&\text{(RD-Found)}\\
\{\mathrm{w} \text{ with } \ell' = \mathrm{v}\}.\{\ell\} &\xrightarrow{\delta} \mathrm{w}.\{\ell\} \quad \text{if } \ell \neq \ell' &&\text{(RD-Follow)}
\end{aligned}
$$
Let the initial environment Γ 0 Γ 0 Gamma_(0)\Gamma_{0}Γ0 contain the following bindings
$$
\begin{aligned}
\{\cdot\} &: \forall \mathrm{X}.\,\mathrm{X} \rightarrow \Pi(\partial\mathrm{X})\\
\{\cdot \text{ with } \ell = \cdot\} &: \forall \mathrm{X}\,\mathrm{X}'\,\mathrm{Y}.\,\Pi(\ell:\mathrm{X};\mathrm{Y}) \rightarrow \mathrm{X}' \rightarrow \Pi(\ell:\mathrm{X}';\mathrm{Y})\\
\cdot\{\ell\} &: \forall \mathrm{X}\,\mathrm{Y}.\,\Pi(\ell:\mathrm{X};\mathrm{Y}) \rightarrow \mathrm{X}
\end{aligned}
$$
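Operationally, the reduction rules above describe a linear search through an association list that ends at a default value. The following Python sketch is an illustrative, untyped model (the tuple encoding is ours, not the chapter's); its control flow mirrors RD-Found, RD-Follow, and RD-Default:

```python
def record(default):
    """{v}: a record mapping every label to the default value v."""
    return ("default", default)

def update(w, label, v):
    """{w with l = v}: polymorphic update, represented as a cons cell."""
    return ("with", w, label, v)

def access(w, label):
    """w.{l}: linear search through the association list."""
    while w[0] == "with":
        _, rest, l, v = w
        if l == label:
            return v          # RD-Found
        w = rest              # RD-Follow
    return w[1]               # RD-Default

# A record with default 0 whose fields x and y are defined.
r = update(update(record(0), "x", 1), "y", 2)
```

Accessing an undefined label, such as `access(r, "z")`, falls all the way through to the default, exactly as RD-Default prescribes.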
1.11.16 Exercise [Full Records, $\star\star\star$, $\nrightarrow$]: Check that these definitions meet the requirements of Definition 1.7.6.
1.11.17 Exercise [Field Exchange, $\star\star$]: Add an operation to permute two fields of a record: give the reduction rules and the typing assumptions, and check that the requirements of Definition 1.7.6 are preserved.
1.11.18 Exercise [Normal Forms for Records, $\star\star\star\star$]: Record values may contain repetitions. For instance, $\{\{\mathrm{w} \text{ with } \ell = \mathrm{v}\} \text{ with } \ell = \mathrm{v}'\}$ is a value that is in fact observationally equivalent to $\{\mathrm{w} \text{ with } \ell = \mathrm{v}'\}$. So are the two record values $\{\{\mathrm{w} \text{ with } \ell = \mathrm{v}\} \text{ with } \ell' = \mathrm{v}'\}$ and $\{\{\mathrm{w} \text{ with } \ell' = \mathrm{v}'\} \text{ with } \ell = \mathrm{v}\}$ when $\ell' \neq \ell$. Modify the semantics so that two record values of the same type have similar structure and records do not contain inaccessible values.
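One possible starting point for this exercise, sketched in Python, is to normalize record values eagerly, keeping only the outermost definition of each field and emitting fields in a fixed order. This is one design among several, using a hypothetical tuple encoding of values: `("with", w, l, v)` for {w with l = v} and `("default", v)` for {v}.

```python
def normalize(w):
    """Rebuild a record value so that each label is defined at most once
    (keeping the outermost definition) and fields appear in a fixed order."""
    fields = {}
    while w[0] == "with":
        _, rest, l, v = w
        fields.setdefault(l, v)   # inner definitions of l are shadowed
        w = rest
    out = w                        # the default cell ("default", v)
    for l in sorted(fields, reverse=True):
        out = ("with", out, l, fields[l])
    return out
```

Under this normalization, the observationally equivalent values mentioned above become syntactically identical, and shadowed (inaccessible) definitions are dropped.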
1.11.19 Exercise [Map Apply, $\star\star$]: Add a binary operation map such that the expressions $(\text{map } \mathrm{v}\ \mathrm{w}).\{\ell\}$ and $(\mathrm{v}.\{\ell\})\ (\mathrm{w}.\{\ell\})$ reduce to the same value for every label $\ell$.
1.11.20 Exercise [$\star$, $\nrightarrow$]: So far, full records are almost constants. This condition is not necessary for values, but only for types. As an example, introduce a primitive record, that is, a nullary record constructor, that maps every label to a distinct integer. Give its typing assumption and review the semantics of records.
As opposed to full records, standard records are partial: their domains are finite (but drawn from an infinite carrier) and statically determined from their types. Standard records can be built by extending the empty record on a finite number of fields. We refer to such records as records with polymorphic extension. Records with polymorphic extension can be obtained by means of an encoding into full records, much as in the finite case.
1.11.21 Example [Encoding of Polymorphic Extension]: Reusing the two type definitions abs and pre that were introduced for the finite case, we let abs encode an undefined field and pre $\mathrm{v}$ encode a field with value $\mathrm{v}$. We use the following syntactic sugar, together with its meaning and principal types:
$$
\begin{aligned}
\langle\rangle &\stackrel{\text{def}}{=} \{\mathrm{abs}\} &&: \Pi(\partial\mathrm{abs})\\
\langle\cdot \text{ with } \ell = \cdot\rangle &\stackrel{\text{def}}{=} \lambda\mathrm{w}.\lambda\mathrm{v}.\{\mathrm{w} \text{ with } \ell = \mathrm{pre}\ \mathrm{v}\} &&: \forall \mathrm{X}\,\mathrm{X}'\,\mathrm{Y}.\,\Pi(\ell:\mathrm{X};\mathrm{Y}) \rightarrow \mathrm{X}' \rightarrow \Pi(\ell:\mathrm{pre}\ \mathrm{X}';\mathrm{Y})\\
\cdot\langle\ell\rangle &\stackrel{\text{def}}{=} \lambda\mathrm{v}.\,\mathrm{pre}^{-1}(\mathrm{v}.\{\ell\}) &&: \forall \mathrm{X}\,\mathrm{Y}.\,\Pi(\ell:\mathrm{pre}\ \mathrm{X};\mathrm{Y}) \rightarrow \mathrm{X}
\end{aligned}
$$
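At the level of values, the encoding can be mimicked directly: every defined field stores a tagged value pre v, the default is abs, and access strips the pre tag, failing on abs. A hedged Python sketch follows (the tags and the tuple representation are ours, not the chapter's):

```python
def empty():
    """<>: the empty record; every field defaults to the abs tag."""
    return ("default", ("abs",))

def extend(w, label, v):
    """<w with l = v>: store the field wrapped as pre v."""
    return ("with", w, label, ("pre", v))

def project(w, label):
    """w.<l>: linear search, then strip the pre tag; raising on an
    undefined field plays the role of pre^-1 applied to abs."""
    while w[0] == "with":
        _, rest, l, v = w
        if l == label:
            tagged = v
            break
        w = rest
    else:
        tagged = w[1]            # fell through to the abs default
    if tagged[0] != "pre":
        raise KeyError(label)    # pre^-1 is undefined on abs
    return tagged[1]

r = extend(extend(empty(), "x", 1), "y", 2)
```

Here `project(r, "z")` raises, reflecting the fact that accessing an undefined field is ill-typed in the encoding.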
1.11.22 Exercise [Recommended, $\star$]: Extension may actually override an existing field. Can you define a version of polymorphic extension that prevents this situation from happening? Add an operation that hides some particular field of a record.
Extensible records can also be implemented directly, without an encoding into full records. In fact, this requires only a tiny variation on full records.
1.11.23 Example [Records with Polymorphic Extension]: Let $\star$ and $\diamond$ be two basic kinds. Let the basic signature $\mathcal{S}_{0}$ contain (in addition to $\rightarrow$) the covariant isolated type constructors pre of kind $\star \Rightarrow \diamond$ and abs of kind $\diamond$. In the presence of subtyping, we may assume pre $\leqslant$ abs. Let $\mathcal{S}_{1}$ contain the unique covariant isolated type constructor $\Pi$ of kind $\diamond \Rightarrow \star$. Let $\langle\rangle$ be a nullary constructor, $(\langle\cdot \text{ with } \ell = \cdot\rangle)^{\ell \in \mathcal{L}}$ a collection of binary constructors, and $(\cdot\langle\ell\rangle)^{\ell \in \mathcal{L}}$ a collection of unary destructors with the following reduction rules:
$$
\begin{aligned}
\langle\mathrm{w} \text{ with } \ell = \mathrm{v}\rangle.\langle\ell\rangle &\xrightarrow{\delta} \mathrm{v} &&\text{(ER-Found)}\\
\langle\mathrm{w} \text{ with } \ell' = \mathrm{v}\rangle.\langle\ell\rangle &\xrightarrow{\delta} \mathrm{w}.\langle\ell\rangle \quad \text{if } \ell \neq \ell' &&\text{(ER-Follow)}
\end{aligned}
$$
Let Γ 0 Γ 0 Gamma_(0)\Gamma_{0}Γ0 contain the following typing assumptions:
$$
\begin{aligned}
\langle\rangle &: \Pi(\partial\mathrm{abs})\\
\langle\cdot \text{ with } \ell = \cdot\rangle &: \forall \mathrm{X}\,\mathrm{X}'\,\mathrm{Y}.\,\Pi(\ell:\mathrm{X};\mathrm{Y}) \rightarrow \mathrm{X}' \rightarrow \Pi(\ell:\mathrm{pre}\ \mathrm{X}';\mathrm{Y})\\
\cdot\langle\ell\rangle &: \forall \mathrm{X}\,\mathrm{Y}.\,\Pi(\ell:\mathrm{pre}\ \mathrm{X};\mathrm{Y}) \rightarrow \mathrm{X}
\end{aligned}
$$
Notice that the typing assumptions obtained from the direct approach are identical to those obtained via the encoding into full records in Example 1.11.21.
1.11.24 Exercise [$\star\star\star\star$, $\nrightarrow$]: Prove the equivalence between the direct semantics and the semantics via the encoding into records with a default.
1.11.25 Exercise [Recommended, $\star\star$, $\nrightarrow$]: Prove that type soundness for extensible records holds in both the subtyping and equality-only models.
1.11.26 Exercise [Recommended, $\star$, $\nrightarrow$]: Check that, in the subtyping model, a record with more fields can be used in place of a record with fewer fields. Check that this is not the case in the equality-only model.
1.11.27 Example [Refinement of Record Types]: In an equality-only model, records with more fields cannot be used in place of records with fewer fields. However, this may be partially recovered by a small refinement of the structure of types. The presence of fields can actually be split from their types, thus enabling some polymorphism over the presence of fields while the types of the fields themselves remain fixed. Let $\circ$ be a new basic kind. Let type constructors abs and pre be both of kind $\circ$ and let $\cdot$ be a new type constructor of kind $\circ \otimes \star \Rightarrow \diamond$. Let $\Gamma_{0}$ contain the following typing assumptions:
$$
\begin{aligned}
\langle\rangle &: \forall \mathrm{X}.\,\Pi(\partial(\mathrm{abs} \cdot \mathrm{X}))\\
\langle\cdot \text{ with } \ell = \cdot\rangle &: \forall \mathrm{Z}\,\mathrm{X}\,\mathrm{X}'\,\mathrm{Y}.\,\Pi(\ell:\mathrm{X};\mathrm{Y}) \rightarrow \mathrm{X}' \rightarrow \Pi(\ell:\mathrm{Z} \cdot \mathrm{X}';\mathrm{Y})\\
\cdot\langle\ell\rangle &: \forall \mathrm{X}\,\mathrm{Y}.\,\Pi(\ell:\mathrm{pre} \cdot \mathrm{X};\mathrm{Y}) \rightarrow \mathrm{X}
\end{aligned}
$$
The semantics of records remains unchanged. The new signature strictly generalizes the previous one (strictly more programs can be typed) while preserving type soundness. Here is a program that can now be typed and that could not be typed before:
$$
(\text{if } a \text{ then } \langle\langle\langle\rangle \text{ with } \ell' = \text{true}\rangle \text{ with } \ell = 1\rangle \text{ else } \langle\langle\rangle \text{ with } \ell = 2\rangle).\langle\ell\rangle
$$
Notice, however, that when a present field is forgotten, the type of the field is not. Therefore, two records defining the same field with incompatible types still cannot be mixed, which is possible in the subtyping model.
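To see why the refined signature types the conditional above while the previous one does not, one can run a toy first-order unification on the entries of field $\ell'$ in the two branches, with presence and type kept as separate components. All names and the representation below are illustrative, not the chapter's:

```python
def resolve(t, subst):
    # Follow variable bindings until a non-bound term is reached.
    while t[0] == "var" and t[1] in subst:
        t = subst[t[1]]
    return t

def unify(a, b, subst):
    """Unify two atom-or-variable terms, returning an extended substitution."""
    a, b = resolve(a, subst), resolve(b, subst)
    if a == b:
        return subst
    if a[0] == "var":
        return {**subst, a[1]: b}
    if b[0] == "var":
        return {**subst, b[1]: a}
    raise TypeError(f"clash: {a} vs {b}")

# Field l' in the two branches of the conditional, as (presence, type) pairs:
#   then-branch: presence is the extension's variable Z, type is bool
#   else-branch: presence is abs (from the empty record), type is a variable X
then_field = (("var", "Z"), ("bool",))
else_field = (("abs",), ("var", "X"))

s = {}
s = unify(then_field[0], else_field[0], s)  # binds Z := abs
s = unify(then_field[1], else_field[1], s)  # binds X := bool
```

Because the extension's presence component is a variable, it can be instantiated to abs to match the else-branch while the type component bool is retained; with the unrefined signature, the two entries would be the incompatible atoms pre bool and abs.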
1.11.28 Example [Refined Subtyping]: The previous refinement for an equality-only model is not as interesting in the case of a subtyping model.
The subtyping assumption pre $\leqslant$ abs makes abs play the role of $\top$ for fields. That is, abs encodes the absence of information and not the information of absence. In other words, a value whose field $\ell$ has type abs may either be undefined or defined at field $\ell$; in the latter case, the fact that field $\ell$ is actually defined has simply been forgotten. Thus, types provide only a lower approximation of the actual domain of records. This is a loss of accuracy by comparison with the equality-only model, where the domain of a record is known from its type. As a result, some optimizations in the representation of records that are only possible when the exact domain of a record is statically known are lost.
Fortunately, there is a way to recover such accuracy. A conservative solution would of course be to drop the inequality pre $\leqslant$ abs. Notice that this would still be more expressive than using an equality model, since, for instance, $\Pi(\ell: \mathrm{pre}\ (\mathrm{T}_{1} \rightarrow \mathrm{T}_{2}); \mathrm{T}) \leq \Pi(\ell: \mathrm{pre}\ \top; \mathrm{T})$ would still hold, as long as $\rightarrow \leq \top$ does hold. This solution is known as depth-only subtyping for records, while the previous one provided both depth and width record subtyping. Conversely, one could also keep width subtyping and disallow depth subtyping, by preserving the relation pre $\leqslant$ abs while requiring pre to be invariant; in this case, the presence of fields can be forgotten as a whole, but the types of fields cannot be weakened as long as fields remain visible.
Another, more interesting, solution consists in introducing another type constructor either of kind $\diamond$ and assuming that pre $\leqslant$ either and abs $\leqslant$ either (but pre $\nleq$ abs). Here, either plays the role of $\top$ for fields and means either present (and forgotten) or absent, while abs really means absent. The accuracy of typechecking can be formally stated as the fact that a record value of type $\Pi(\ell: \mathrm{abs}; \mathrm{T})$ cannot define field $\ell$.
1.11.29 Example [Mixed subtyping]: It is tempting to mix all the variations of Example 1.11.28 together. As a first attempt, we may assume that the basic signature $\mathcal{S}_0$ contains covariant type constructors pre and maybe and invariant type constructors $\mathsf{pre}_=$ and $\mathsf{maybe}_=$, all of kind $\star \Rightarrow \diamond$, as well as two type constructors abs and either of kind $\diamond$, and that the subtype ordering $\leqslant$ is defined by the following diagram:
Intuitively, we wish that $\mathsf{pre}_=$ and $\mathsf{maybe}_=$ be logically invariant, that pre and maybe be logically covariant, and that the equivalences $\mathsf{pre}_=\,\mathrm{T} \leq \mathsf{maybe}_=\,\mathrm{T}' \equiv \mathrm{T} = \mathrm{T}'$ and
$$\mathsf{pre}_=\,\mathrm{T} \leq \mathsf{pre}\,\mathrm{T}' \;\equiv\; \mathsf{pre}\,\mathrm{T} \leq \mathsf{maybe}\,\mathrm{T}' \;\equiv\; \mathsf{maybe}_=\,\mathrm{T} \leq \mathsf{maybe}\,\mathrm{T}' \;\equiv\; \mathrm{T} \leq \mathrm{T}' \tag{1}$$
simultaneously hold. However, (1) requires, for instance, the type constructors $\mathsf{pre}_=$ and pre to have the same direction, which is not currently possible, since they do not have the same variance. Interestingly, this restriction may be relaxed by assigning variances to directions on a per-type-constructor basis and defining structural subtyping accordingly (see Exercise 1.11.30). Then, replacing all occurrences of pre by $\mathsf{pre}_=$ in $\Gamma_0$ preserves type soundness and allows for both accurate record types and flexible subtyping in the same setting.
1.11.30 Exercise [Relaxed variances, $\star\star\star$, $\nrightarrow$]: Let $\emptyset$ be allowed as a new variance, let the composition of variances defined in Example 1.3.9 be extended with $\nu\emptyset = \emptyset$, and let $\leqslant^{\emptyset}$ stand for the full relation on type constructors. Let each type constructor $F$ of signature $K \Rightarrow \kappa$ now come with a mapping $\vartheta(F)$ from $\operatorname{dom}(K)$ to variances. Let $\vartheta(t, t', \pi)$ be the variance of two ground types $t$ and $t'$ at a path $\pi$, recursively defined by $\vartheta(t, t', d \cdot \pi) = (\vartheta(t(\epsilon))(d) \cap \vartheta(t'(\epsilon))(d))\,\vartheta(t/d, t'/d, \pi)$ and $\vartheta(t, t', \epsilon) = +$. Then define the interpretation of subtyping as follows: if $t, t' \in \mathcal{M}_{\kappa}$, let $t \leq t'$ hold if and only if, for every path $\pi \in \operatorname{dom}(t) \cap \operatorname{dom}(t')$, $t(\pi) \leqslant^{\vartheta(t, t', \pi)} t'(\pi)$ holds.
Check that the relation $\leq$ remains a partial ordering. Check that a type constructor whose direction $d$ has been syntactically declared covariant (respectively contravariant, invariant) is still logically covariant (respectively contravariant, invariant) in $d$.

Record concatenation

Record concatenation takes two records and combines them into a new record whose fields are taken from whichever argument defines them. Of course, there is an ambiguity when the two records do not have disjoint domains, and a choice must be made to disambiguate such cases. Symmetric concatenation leaves concatenation undefined in that case (Harper and Pierce, 1991), while asymmetric concatenation lets one side (usually the right one) always take priority. Despite its rather simple semantics, record concatenation remains hard to type (with either the strict or the priority semantics). Solutions to type inference for record concatenation may be found, for instance, in (Wand, 1989; Rémy, 1992; Pottier, 2000).
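Since ML has no primitive record concatenation, the two semantics can be sketched on a simple untyped model of records as association lists. The function names below are ours; the sketch illustrates only the dynamic semantics and deliberately sidesteps the typing difficulty that is the subject of this paragraph.

```ocaml
(* Records modeled as association lists from labels to values. *)

(* Asymmetric concatenation: the right-hand record takes priority
   on the fields that both records define. *)
let asym r1 r2 =
  r2 @ List.filter (fun (l, _) -> not (List.mem_assoc l r2)) r1

(* Symmetric concatenation: undefined (None) when the domains overlap. *)
let sym r1 r2 =
  if List.exists (fun (l, _) -> List.mem_assoc l r2) r1
  then None
  else Some (r1 @ r2)
```

For instance, `asym [("a", 1)] [("a", 2)]` yields a record where `"a"` maps to `2`, while `sym` on the same arguments is `None`.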

Polymorphic variants

Variants can be defined via algebraic data type definitions. However, as with record fields, variant tags are taken from a relatively small, finite collection of labels, and two variant definitions will have incompatible types. Thus, to remain compatible, two variants must choose their tags from a larger collection that is a superset of all the possible tags of either variant. In general, this reduces the accuracy of types and forces useless dynamic checks for tags that could otherwise be known not to occur. Extensible variants (page 93) allow working with an arbitrarily large collection of tags, but do not improve accuracy. Polymorphic variants refer to a more precise typechecking mechanism for variants, where types describe more accurately the tags that may actually occur. They allow building values of sum types out of a large, potentially infinite, predefined set of tags, and calling polymorphic functions to explore them. As for records, this problem could be tackled by first considering polymorphic operations over variants built from a finite set of tags and total variants with an infinite set of tags independently, and then combining both approaches. We propose a direct solution, by a simple analogy with records.
Indeed, the type constructor pre can be used to distinguish a (finite) set of tags that the variant may actually carry from the other tags, which are certain not to occur and are typed with abs. For example, a variant $\ell.v$, built from a value $v$ with a constructor tag $\ell$ of arity one, may be assigned the principal type scheme $\forall X.\,\Sigma(\ell : \mathsf{pre}\ T;\, X)$, where $T$ is the type of $v$. The unary type constructor $\Sigma$ is used to coerce rows to variant types; thus, variant types and record types may share the same inner row structure and are simply distinguished by their head symbol. An instance of this polymorphic type is $\Sigma(\ell : \mathsf{pre}\ T;\, \partial\mathsf{abs})$, which tells that the variant must have been built with tag $\ell$ and no other tag, thus retaining exact information about the shape of the value. Another instance of the polymorphic variant type is $\Sigma(\ell : \mathsf{pre}\ T;\, \ell' : \mathsf{pre}\ T';\, \partial\mathsf{abs})$. Indeed, it is sound to assume that the value might also have been built with some other tag $\ell'$, even if we know that this is not actually the case. Interestingly, both values $\ell.v$ and $\ell'.v'$ have this type and can be mixed at this type.
We use filters to explore variants. A filter $[\ell : v \mid v']$ is a function that expects a variant argument, thus of the form $\ell'.w$. It then proceeds with $v\ w$, if $\ell' = \ell$, or with $v'\ (\ell'.w)$ otherwise. The type of this filter is $\Sigma(\ell : \mathsf{pre}\ T;\, T') \to T''$, where $T$ is the type of values accepted by $v$, $\Sigma(\ell : T''';\, T')$ is the type of variants accepted by $v'$, and $T''$ is the type of values returned by both $v$ and $v'$. Any type $T'''$ would do, including, in particular, abs. Indeed, when $\ell'.w$ is passed to $v'$, it is known not to have tag $\ell$, so the behavior of $v'$ on $\ell$ does not matter. The null filter $[]$ can be used for $v'$. This filter should never actually be applied, which we ensure by assigning $[]$ the type $\forall X.\,\Sigma(\partial\mathsf{abs}) \to X$, for no variant value has type $\Sigma(\partial\mathsf{abs})$.
For instance, the filter $[\ell : v_\ell \mid [\ell' : v_{\ell'} \mid\, []\,]]$, which may be abbreviated as $[\ell : v_\ell \mid \ell' : v_{\ell'}]$, can be applied to either $\ell.v$ or $\ell'.v'$. The following example formalizes polymorphic variants.
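OCaml's polymorphic variants implement essentially this mechanism with row-based type inference, so the discussion above can be tried out directly. The tags and functions below are our own illustrations: a tag such as `` `A `` needs no prior declaration, and the inferred row type records which tags may occur.

```ocaml
(* `A needs no declaration: `A 1 has type [> `A of int ], read "built with
   tag A, usable wherever further tags are allowed", in the spirit of the
   row variable X in the scheme forall X. Sigma(A : pre T; X). *)
let v = `A 1

(* A filter with a default branch, analogous to [A : f | default]:
   the default branch receives the whole variant value. *)
let filter f default = function
  | `A x -> f x
  | other -> default other

let r1 = filter (fun n -> n + 1) (fun _ -> 0) v          (* tag matches *)
let r2 = filter (fun n -> n + 1) (fun _ -> 0) (`B "hi")  (* falls through *)
```

The inferred type of `filter` is `('a -> 'b) -> (([> `A of 'a ] as 'c) -> 'b) -> 'c -> 'b`, a close relative of the typing assumption for $[\ell : \cdot \mid \cdot]$ given below.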
1.11.31 Example [Polymorphic variants]: Let $\star$ and $\diamond$ be two basic kinds. Let $\mathcal{S}$ contain, in addition to the arrow type constructor, the two type constructors pre of kind $\star \Rightarrow \diamond$ and abs of kind $\diamond$. In the presence of subtyping, we may assume abs $\leqslant$ pre. Let $\mathcal{S}_1$ contain the unique covariant isolated type constructor $\Sigma$ of kind $\diamond \Rightarrow \star$. Let $\Gamma_0$ be composed of the unary constructors $(\ell.\cdot)^{\ell \in \mathcal{L}}$ and of the primitives $[]$ of arity 0 and $([\ell : \cdot \mid \cdot]\,\cdot)^{\ell \in \mathcal{L}}$ of arity 3, given with the following reduction rules:
$$[\ell : v \mid v']\ \ell.w \;\xrightarrow{\;\delta\;}\; v\ w \qquad\qquad [\ell : v \mid v']\ \ell'.w \;\xrightarrow{\;\delta\;}\; v'\ (\ell'.w) \quad \text{if } \ell \neq \ell'$$
and with the following typing assumptions:
$$\begin{aligned} \ell.\cdot \;&:\; \forall X\,Y.\ X \to \Sigma(\ell : \mathsf{pre}\ X;\, Y) \\ [] \;&:\; \forall X.\ \Sigma(\partial\mathsf{abs}) \to X \\ [\ell : \cdot \mid \cdot] \;&:\; \forall X\,X'\,Y\,Y'.\ (X \to Y) \to (\Sigma(\ell : X';\, Y') \to Y) \to \Sigma(\ell : \mathsf{pre}\ X;\, Y') \to Y \end{aligned}$$
1.11.32 Exercise [Soundness for extensible variants, $\star\star\star\star$, $\nrightarrow$]: Prove type soundness for extensible variants in both the equality-only and subtyping models.

Other applications of rows

Polymorphic records and variants are the most well-known applications of rows. Besides the many variations on their presentation (we have illustrated only some of them), there are several other interesting applications of rows.
Since objects can be viewed as records of functions, at least from a typing point of view, rows can also be used to type structural objects (Wand, 1994; Rémy, 1994; Rémy and Vouillon, 1998) and provide, in particular, polymorphic method invocation. This is the key to typechecking objects in Objective Caml (Rémy and Vouillon, 1998). First-class messages (Nishimura, 1998; Müller and Nishimura, 1998; Pottier, 2000) combine records and variants in an interesting way: while filters over variant types force all branches to have the same return type, first-class messages treat filters as records of functions (also called objects) rather than as functions from a variant type to a shared return type. A message is an element of a variant type. The application of an object to a message, that is, of a record of functions to a variant value, selects from the record the branch labeled with the same tag as the message and applies it to the content of the message, much as pattern matching does. However, these applications are typechecked more accurately: the domain of the record is first restricted to the set of tags that the message may possibly carry, so the other branches, and in particular their return types, are left unconstrained.
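Polymorphic method invocation can be observed directly in OCaml, where an object type lists its methods as a row and `..` stands for an anonymous row variable. The objects below are our own illustrations.

```ocaml
(* greet works on any object providing a method name : string, whatever
   other methods it may also have. Its inferred type is
   < name : string; .. > -> string, where ".." is a row variable
   in the sense of this section. *)
let greet o = "hello " ^ o#name

let alice = object method name = "alice" method age = 30 end
let bob = object method name = "bob" end

(* greet is polymorphic in the row, so it applies to both objects,
   even though their full method sets differ. *)
let g1 = greet alice
let g2 = greet bob
```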
Row types may also represent sets of properties within types, or type refinements, and be used in type systems for program analysis. Two examples worth mentioning are their application to soft typing (Cartwright and Fagan, 1991; Wright and Cartwright, 1994) and to the typechecking of uncaught exceptions (Leroy and Pessaux, 2000).
The key to rows is to decompose the set of row labels into a class of finite partitions that is closed under certain operations. Here, these partitions are composed of singleton labels and co-finite sets of labels; the operations are merging (or, conversely, splitting) a singleton label and a co-finite set of labels. Other decompositions are possible: for instance, one could imagine considering labels in a two-dimensional space. More generally, labels might also be given internal structure; for instance, one might consider automata as labels. Notice also that record types are stratified, since rows, that is, expressions of kind $\operatorname{Row}(L)$, may not themselves contain records: constructors of $\mathcal{S}_1$ are only given the image kind Type. This restriction can be partially relaxed, leading to rows of increasing degrees (Rémy, 1992b) ... and complexity! Yet more intriguing are type-indexed rows, where labels are themselves types (Shields and Meijer, 2001).

Alternatives to rows

The original idea of using rows to describe types of extensible records is due to Wand (1987, 1988). A key simplification to row types is to make them total functions from labels to types and to encode definedness explicitly in the structure of fields, for instance with the pre and abs type constructors, as presented here. This decomposition reduces the resolution of unification constraints to simple equational reasoning (Rémy, 1993, 1992a). Other approaches, which do not treat rows as total functions, seem more ad hoc and often have hard-wired restrictions (Jategaonkar and Mitchell, 1988; Ohori and Buneman, 1989; Berthomieu, 1993; Ohori, 1999). Among these partial solutions, (Ohori, 1999) is quite interesting for its overall simplicity in the case where polymorphic access alone is required. Rows and fields may also be represented within ad hoc type constraints rather than via terms and equality (or subtyping) constraints. For example, qualified types use the predicates ($T$ has $\ell : T'$) and ($T$ lacks $\ell$) to mean that field $\ell$ of row $T$ is defined with type $T'$ or undefined, respectively (Jones, 1994b; Odersky, Sulzmann, and Wehr, 1999b). In our equality-only model, these constraints are in fact equivalent to $\exists X.\ T = (\ell : \mathsf{pre}\ T';\, X)$ and $\exists X.\ T = (\ell : \mathsf{abs};\, X)$, respectively. Record typechecking has also been widely studied in the presence of subtyping. Usually, record subtyping is given meaning directly, and not via rows. While these solutions are quite expressive, thanks to subtyping, they still suffer from their nonstructural treatment of record types and cannot type row extension.
Thus, even in subtyping models, the use of rows increases expressiveness, and is usually a simplification as well. The subtyping model can then also take advantage of the possibility of enriching the type constructors pre and abs with more structure and relating them via subtyping (Pottier, 2000). Notice that, even though rows were introduced for type inference, they seem to be beneficial to explicitly typed languages as well, since even other advanced solutions (Cardelli and Mitchell, 1991; Cardelli, 1992) are limited.
The rules of Figure 1-19 are one way of solving row type constraints. In a model with subtyping constraints, a more direct closure-based resolution may be more appropriate (Pottier, 2003).
B Solutions to Selected Exercises
1.2.6 Solution: The definition does not behave as expected, because if is a destructor, whose arguments (according to the call-by-value semantics of ML-the-calculus) are evaluated before R-TRUE or R-FALSE is allowed to fire. As a result, the semantics of the expression if $t_0$ then $t_1$ else $t_2$ is to evaluate both $t_1$ and $t_2$ before choosing one of them. Since these expressions may have side effects (for instance, they may fail to terminate, or update a reference), this semantics is undesirable. The desired evaluation order can be obtained by placing $t_1$ and $t_2$ within closures, which delays their evaluation, then invoking the closure returned by the conditional, forcing its body to be evaluated. In other words, the expression if $t_0$ then $t_1$ else $t_2$ should now be viewed as syntactic sugar for if $t_0\ (\lambda z.t_1)\ (\lambda z.t_2)\ \hat{0}$. The choice of the constant $\hat{0}$ is arbitrary, since it is discarded; any value would do.
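The thunking trick can be observed directly in OCaml by defining a conditional as an ordinary call-by-value function; the names `if_`, `safe`, and `eager_raises` are ours.

```ocaml
(* A user-defined conditional is just a function, so under call-by-value
   both branch arguments are evaluated before its body runs. *)
let if_ c v1 v2 = if c then v1 else v2

(* Passing raw branches is dangerous: the untaken branch is evaluated
   while the arguments are being reduced, before if_ can choose. *)
let eager_raises =
  try ignore (if_ true 1 (failwith "boom")); false
  with Failure _ -> true

(* The fix: wrap each branch in a closure to delay it, then apply the
   chosen closure to () (playing the role of the discarded constant
   0-hat, whose identity is irrelevant). *)
let safe =
  (if_ true (fun () -> 1) (fun () -> failwith "never evaluated")) ()
```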
1.2.21 Solution: Within Damas and Milner's type system, we have:
Please note that, because $X$ occurs free within the environment $z_1 : X$, it is impossible to apply DM-GEN to the judgement $z_1 : X \vdash z_1 : X$ in a nontrivial way. For this reason, $z_2$ cannot receive the type scheme $\forall X.X$, and the whole expression cannot receive type $X \to Y$, where $X$ and $Y$ are distinct.
1.2.22 Solution: It is straightforward to prove that the identity function has type int rarr\rightarrow int:
$$\frac{\overline{\Gamma_0;\, z : \text{int} \vdash z : \text{int}}}{\Gamma_0 \vdash \lambda z.z : \text{int} \to \text{int}}\ \text{DM-ABS}$$
In fact, nothing in this type derivation depends on the choice of int as the type of $z$. Thus, we may just as well use a type variable $X$ instead. Furthermore, after forming the arrow type $X \to X$, we may employ DM-GEN to quantify universally over $X$, since it no longer appears in the environment.
It is worth noting that, although the type derivation employs an arbitrary type variable $X$, the final typing judgement has no free type variables. It is thus independent of the choice of $X$. In the following, we refer to the above type derivation as $\Delta_0$.
Next, we prove that the successor function has type int $\to$ int under the initial environment $\Gamma_0$. We write $\Gamma_1$ for $\Gamma_0;\, z : \text{int}$, and make uses of DM-VAR implicit.
In the following, we refer to the above type derivation as $\Delta_1$. We may now build a derivation for the third typing judgement. We write $\Gamma_2$ for $\Gamma_0;\, f : \text{int} \to \text{int}$.
$$\frac{\Delta_1 \qquad \dfrac{\Gamma_2 \vdash f : \text{int} \to \text{int} \qquad \Gamma_2 \vdash \hat{2} : \text{int}}{\Gamma_2 \vdash f\ \hat{2} : \text{int}}}{\Gamma_0 \vdash \text{let } f = \lambda z.\, z \mathbin{\hat{+}} \hat{1} \text{ in } f\ \hat{2} : \text{int}}\ \text{DM-LET}$$
To derive the fourth typing judgement, we re-use $\Delta_0$, which proves that the identity function has polymorphic type. We write $\Gamma_3$ for $\Gamma_0;\, f : \forall X.X \to X$. By DM-VAR and DM-INST, we have $\Gamma_3 \vdash f : (\text{int} \to \text{int}) \to (\text{int} \to \text{int})$ and $\Gamma_3 \vdash f : \text{int} \to \text{int}$. Thus, we may build the following derivation:
$$\Gamma_3 \vdash f : (\text{int} \to \text{int}) \to (\text{int} \to \text{int})$$
The first and third judgements are valid in the simply-typed $\lambda$-calculus, because they use neither DM-GEN nor DM-INST, and use DM-LET only to introduce the monomorphic binding $f : \text{int} \to \text{int}$ into the environment. The second judgement, of course, is not: because it involves a nontrivial type scheme, it is not even a well-formed judgement in the simply-typed $\lambda$-calculus. The fourth judgement is well-formed, but not derivable, in the simply-typed $\lambda$-calculus. This is because $f$ is used at two incompatible types, namely $(\text{int} \to \text{int}) \to (\text{int} \to \text{int})$ and $\text{int} \to \text{int}$, inside the expression $f\ f\ \hat{2}$. Both of these types are instances of $\forall X.X \to X$, the type scheme assigned to $f$ in the environment $\Gamma_3$.
By inspection of DM-VAR, DM-GEN, and DM-INST, it is straightforward to see that, if $\Gamma_0 \vdash \hat{1} : T$ is derivable, then $T$ must be int. Since int is not an arrow type, the application $\hat{1}\ \hat{2}$ cannot be well-typed under $\Gamma_0$. In fact, because this expression is stuck, it cannot be well-typed in a sound type system.
The expression $\lambda f.(f\ f)$ is ill-typed in the simply-typed $\lambda$-calculus, because no type $T$ may coincide with a type of the form $T \to T'$. Indeed, $T$ would then be a strict subterm of itself. For the same reason, this expression is ill-typed in DM as well. Indeed, it is not difficult to check that the presence of DM-GEN and DM-INST makes no difference: DM-GEN cannot generalize $T$ as long as the binding $f : T$ appears in the environment, and DM-INST can only instantiate $T$ to $T$ itself. Thus, the self-application $f\ f$ is well-typed in DM only if $f$ is let-bound, as opposed to $\lambda$-bound. The argument crucially relies on the fact that $f$ must be assigned a monotype. Indeed, the expression $\lambda f.(f\ f)$ is well-typed in an implicitly-typed variant of System F: one of its types is $(\forall X.X \to X) \to (\forall X.X \to X)$. It also relies on the fact that types are finite: indeed, this expression is well-typed in an extension of the simply-typed $\lambda$-calculus with recursive types, where the equation $T = T \to T'$ has a solution.
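The contrast between $\lambda$-bound and let-bound self-application can be checked in OCaml, whose core follows Damas and Milner. The expression `fun f -> f f` is rejected by the typechecker, for exactly the reason given above, whereas the let-bound version below is accepted, because each occurrence of `f` is instantiated separately.

```ocaml
(* fun f -> f f  is rejected by OCaml ("occurs check" failure):
   the type T of f would have to equal T -> T'.
   A let-bound identity, by contrast, may be self-applied: *)
let r =
  let f = fun x -> x in
  (f f) 2
  (* first occurrence of f used at (int -> int) -> (int -> int),
     second at int -> int; both are instances of 'a -> 'a *)
```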
1.2.23 Solution: It is clear that the effect of DM-GEN may be obtained by a series of successive applications of DM-GEN'. Conversely, consider an instance of DM-GEN', whose premises are $\Gamma \vdash t : S$ (1) and $X \notin \mathit{ftv}(\Gamma)$ (2). Let us write $S = \forall\bar{X}.T$, where $\bar{X} \mathbin{\#} \mathit{ftv}(\Gamma)$ (3). Applying DM-INST to (1) and to the identity substitution yields $\Gamma \vdash t : T$ (4). Applying DM-GEN to (4), (2), and (3) yields $\Gamma \vdash t : \forall X\bar{X}.T$, that is, $\Gamma \vdash t : \forall X.S$. Thus, the effect of DM-GEN' may be obtained by DM-INST and DM-GEN.
It is clear that DM-INST is a particular case of DM-INST' where $\bar{Y}$ is empty. Conversely, consider an instance of DM-INST', whose premises are $\Gamma \vdash t : \forall\bar{X}.T$ (1) and $\bar{Y} \mathbin{\#} \mathit{ftv}(\forall\bar{X}.T)$ (2). Let $\rho$ be a renaming that exchanges $\bar{Y}$ with $\bar{Z}$, where $\bar{Z} \mathbin{\#} \mathit{ftv}(\forall\bar{Y}.[\vec{X} \mapsto \vec{T}]T)$ (3) and $\bar{Z} \mathbin{\#} \mathit{ftv}(\Gamma)$ (4). Applying DM-INST to (1) yields $\Gamma \vdash t : [\vec{X} \mapsto \rho\vec{T}]T$ (5). Applying DM-GEN to (5) and (4) yields $\Gamma \vdash t : \forall\bar{Z}.[\vec{X} \mapsto \rho\vec{T}]T$, that is, $\Gamma \vdash t : \forall\rho\bar{Y}.[\vec{X} \mapsto \rho\vec{T}]T$ (6). Now, by (2) and (3), we have $[\vec{X} \mapsto \rho\vec{T}]T = \rho([\vec{X} \mapsto \vec{T}]T)$, so (6) may be written $\Gamma \vdash t : \forall\rho\bar{Y}.\rho([\vec{X} \mapsto \vec{T}]T)$, that is, $\Gamma \vdash t : \rho(\forall\bar{Y}.[\vec{X} \mapsto \vec{T}]T)$ (7). By (3), this is exactly $\Gamma \vdash t : \forall\bar{Y}.[\vec{X} \mapsto \vec{T}]T$. Thus, the effect of DM-INST' may be obtained by DM-INST and DM-GEN.
1.4.4 Solution: Let us recall that a program $t$ is well-typed if and only if a judgement of the form $C, \Gamma \vdash t : \sigma$, where $C$ is satisfiable, holds. Let us show that it is in fact possible, without loss of generality, to require $\sigma$ to be a monotype.
Assume $C, \Gamma \vdash t : \sigma$ (1) is derivable within $\mathrm{HM}(X)$. Let us write $\sigma = \forall\bar{X}[D].T$, where $\bar{X} \mathbin{\#} \mathit{ftv}(C)$ (2). Applying Lemma 1.4.1 to (1) yields $C \Vdash \exists\bar{X}.D$ (3). By HM-INST, (1) implies $C \wedge D, \Gamma \vdash t : T$ (4). By (3), we have $C \equiv C \wedge \exists\bar{X}.D \equiv \exists\bar{X}.(C \wedge D)$. Because $C$ is satisfiable, this implies that $C \wedge D$ is satisfiable as well. Thus, the judgement (4), which involves the monotype $T$, witnesses that $t$ is well-typed.
We have shown that a program $t$ is well-typed if and only if a judgement of the form $C, \Gamma \vdash t : T$, where $C$ is satisfiable, holds. Thus, by Theorems ?? and ??, well-typedness is the same for both rule sets.
1.4.5 Solution: By Theorem ??, every rule in Figure 1-8 is admissible in HM$(X)$. Of course, so is HM-Gen. So, every judgement that is derivable via the rules of Figure 1-8 and HM-Gen is a valid HM$(X)$ judgement.
Conversely, assume $C, \Gamma \vdash t : \sigma$ (1) holds in HM$(X)$. We must show that it is derivable via the rules of Figure 1-8 and HM-Gen. Let us write $\sigma = \forall \bar{X}[D].T$, where $\bar{X} \mathrel{\#} \mathit{ftv}(C, \Gamma)$ (2). By HM-Inst and (1), the judgement $C \wedge D, \Gamma \vdash t : T$ (3) holds in HM$(X)$. This judgement involves a monotype, so, by Theorem ??, it is derivable via the rules of Figure 1-8. Furthermore, from (3) and (2), HM-Gen allows deriving $C \wedge \exists \sigma, \Gamma \vdash t : \sigma$ (4). Applying Lemma 1.4.1 to (1) yields $C \Vdash \exists \sigma$, so the judgement (4) may be written $C, \Gamma \vdash t : \sigma$. We have shown that (1) is derivable via the rules of Figure 1-8 and HM-Gen. In fact, it is possible to apply HM-Gen only once, at the end of the derivation.
1.5.1 Solution: Within the type system PCB$(X)$, we have
The type variable $Z$, which occurs free in the left-hand instance of Var, is generalized. However, $z_2$ does not receive the type scheme $\forall Z.Z$, which, as suggested earlier, is unsound; instead, it receives the constrained type scheme $\forall Z[z_1 \preceq Z].Z$. The latter is more restrictive than the former: indeed, the former claims that $z_2$ has every type, while the latter only claims that every valid type for $z_1$ is also a valid type for $z_2$. Let us now examine the constraint $\text{let } z_1 : X; z_2 : \forall Z[z_1 \preceq Z].Z \text{ in } z_2 \preceq Y$, which appears at the root of the derivation. By C-InId and C-In*, it is equivalent to $\text{let } z_1 : X \text{ in } \exists Z.(z_1 \preceq Z \wedge Z \leq Y)$ and to $\exists Z.(X \leq Z \wedge Z \leq Y)$, which by C-ExTrans is equivalent to $X \leq Y$. Thus, the judgement at the root of the above derivation may be written $X \leq Y \vdash \lambda z_1.\,\text{let } z_2 = z_1 \text{ in } z_2 : X \to Y$.
In other words, the expression $\text{let } z_2 = z_1 \text{ in } z_2$ has type $X \to Y$ only under the assumption that $X$ is a subtype of $Y$, which is sound. Even though Let allows unrestricted generalization of type variables, it remains sound, because the type scheme that it produces typically has free program identifiers, such as $\forall Z[z_1 \preceq Z].Z$ above.
1.7.10 Solution: Let $\mathcal{E} = \text{let } z = \mathcal{E}_1 \text{ in } t_1$ and $\mathcal{E}_1[t]/\mu \sqsubseteq \mathcal{E}_1[t']/\mu'$ (1). Then,
$$
\begin{align*}
& \text{let } \Gamma_0; \text{ref } M \text{ in } \llbracket \mathcal{E}[t]/\mu : T/M \rrbracket \\
= {} & \text{let } \Gamma_0; \text{ref } M \text{ in } \bigl((\text{let } z : \forall X[\llbracket \mathcal{E}_1[t] : X \rrbracket].X \text{ in } \llbracket t_1 : T \rrbracket) \wedge \llbracket \mu : M \rrbracket\bigr) \tag{2}\\
\equiv {} & \text{let } \Gamma_0; \text{ref } M; z : \forall X[\llbracket \mathcal{E}_1[t]/\mu : X/M \rrbracket].X \text{ in } \llbracket t_1 : T \rrbracket \tag{3}\\
\equiv {} & \text{let } \Gamma_0; \text{ref } M; z : \forall X[\text{let } \Gamma_0; \text{ref } M \text{ in } \llbracket \mathcal{E}_1[t]/\mu : X/M \rrbracket].X \text{ in } \llbracket t_1 : T \rrbracket \tag{4}\\
\Vdash {} & \text{let } \Gamma_0; \text{ref } M; z : \forall X\bar{Y}[\text{let } \Gamma_0; \text{ref } M' \text{ in } \llbracket \mathcal{E}_1[t']/\mu' : X/M' \rrbracket].X \text{ in } \llbracket t_1 : T \rrbracket \tag{5}
\end{align*}
$$
where (2) is by definition of constraint generation, where $X \notin \mathit{ftv}(T, M)$ (6); (3) is by (6), C-LetAnd, and by definition of constraint generation; (4) is by (6) and C-LetDup; (5) follows from (1) and C-LetEx, for some $\bar{Y}$ and $M'$ such that $\bar{Y} \mathrel{\#} \mathit{ftv}(X, M)$ (7) and $\mathit{ftv}(M') \subseteq \bar{Y} \cup \mathit{ftv}(M)$ (8) and $\mathrm{dom}(M') = \mathrm{dom}(\mu')$ and $M'$ extends $M$. Note that (6), (7) and (8) imply $X \notin \mathit{ftv}(M')$ (9).
At this point, the type variables $\bar{Y}$, which determine the types of the newly allocated store cells, are universally quantified in the type scheme assigned to $z$, which is undesirable. We are stuck, because we cannot in general apply C-LetAll to hoist $\exists \bar{Y}$ out of the let constraint. Let us now assume that, by some external means, we are guaranteed $\bar{Y} = \varnothing$ (10). Then, we may proceed as follows:
$$
\begin{align*}
\equiv {} & \text{let } \Gamma_0; \text{ref } M'; z : \forall X[\text{let } \Gamma_0; \text{ref } M' \text{ in } \llbracket \mathcal{E}_1[t']/\mu' : X/M' \rrbracket].X \text{ in } \llbracket t_1 : T \rrbracket \tag{11}\\
\equiv {} & \text{let } \Gamma_0; \text{ref } M' \text{ in } \llbracket \mathcal{E}[t']/\mu' : T/M' \rrbracket \tag{12}
\end{align*}
$$
where (11) follows from the fact that the memory locations that appear free in $\llbracket t_1 : T \rrbracket$ are members of $\mathrm{dom}(\mu)$, and thus are not members of $\mathrm{dom}(M') \setminus \mathrm{dom}(M)$; (12) is obtained by performing the steps that lead to (4) in reverse.
The requirement that $\bar{Y}$ be empty, that is, $\mathit{ftv}(M) = \mathit{ftv}(M')$, is classic (Tofte, 1988). How is it enforced? Assume that the left-hand side of every let construct is required to be a non-expansive expression. By assumptions (ii) and (iii), this invariant is preserved by reduction. So, $\mathcal{E}_1[t]$ must be non-expansive, which, by assumption (i), guarantees that the reduction step does not allocate new memory cells. Then, $\mu'$ is $\mu$, so $M'$ is $M$.
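The standard way to enforce this requirement is Tofte's syntactic value restriction: classify each expression as non-expansive or expansive, and generalize at a let only on the former. A minimal classifier over a hypothetical AST (our own tuple encoding, not the chapter's syntax) might look like:

```python
# Hypothetical AST: ("var", x), ("const", c), ("lam", x, t),
# ("app", t1, t2), ("let", x, t1, t2).

def non_expansive(t):
    """True if evaluating t cannot allocate new store cells.

    Variables, constants, and lambda-abstractions are values, hence
    non-expansive; an application (in particular ref v) may allocate,
    so it is conservatively classified as expansive.
    """
    tag = t[0]
    if tag in ("var", "const", "lam"):
        return True
    if tag == "let":
        # let x = t1 in t2 is non-expansive when both components are
        return non_expansive(t[2]) and non_expansive(t[3])
    return False  # applications, including ref v, !v, v1 := v2

assert non_expansive(("lam", "x", ("app", ("var", "f"), ("var", "x"))))
assert not non_expansive(("app", ("const", "ref"), ("const", "0")))
```

A type checker would then generalize at `let x = t1 in t2` only when `non_expansive(t1)` holds, which by assumption (i) guarantees $\mu' = \mu$ in the argument above.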
1.9.1 Solution: We must first ensure that R-Add respects $\sqsubseteq$ (Definition 1.7.5). Since the rule is pure, it is sufficient to establish that $\text{let } \Gamma_0 \text{ in } \llbracket \hat{k}_1 \mathbin{\hat{+}} \hat{k}_2 : T \rrbracket$ entails $\text{let } \Gamma_0 \text{ in } \llbracket \widehat{k_1 + k_2} : T \rrbracket$. In fact, we have
$$
\begin{align*}
& \text{let } \Gamma_0 \text{ in } \llbracket \hat{k}_1 \mathbin{\hat{+}} \hat{k}_2 : T \rrbracket \\
\equiv {} & \text{let } \Gamma_0 \text{ in } \exists XY.(\hat{+} \preceq X \to Y \to T \wedge \hat{k}_1 \preceq X \wedge \hat{k}_2 \preceq Y) \tag{1}\\
\equiv {} & \exists XY.(\mathtt{int} \to \mathtt{int} \to \mathtt{int} \leq X \to Y \to T \wedge \mathtt{int} \leq X \wedge \mathtt{int} \leq Y) \tag{2}\\
\equiv {} & \exists XY.(X = \mathtt{int} \wedge Y = \mathtt{int} \wedge \mathtt{int} \leq T) \tag{3}\\
\equiv {} & \mathtt{int} \leq T \tag{4}\\
\equiv {} & \text{let } \Gamma_0 \text{ in } \llbracket \widehat{k_1 + k_2} : T \rrbracket \tag{5}
\end{align*}
$$
where (1) is by definition of constraint generation; (2) is by definition of $\Gamma_0$, by C-InId and C-In*; (3) is by C-Arrow and by antisymmetry of subtyping; (4) is by C-ExAnd and C-Name; (5) is again by definition of $\Gamma_0$, by C-InId and C-In*, and by definition of constraint generation.
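Steps (2)–(4) decompose an arrow-type inequality and then eliminate the existentially quantified variables. In the special case where subtyping is taken to be equality, this is ordinary first-order unification, which can be sketched as follows (the encoding is ours, for illustration only; there is no occurs check):

```python
# Toy unification: an uppercase string is a type variable; ("int",) and
# ("arrow", t1, t2) are constructed types. Subtyping collapses to
# equality here -- a simplification of the chapter's setting.

def resolve(t, s):
    """Chase variable bindings in substitution s."""
    while isinstance(t, str) and t in s:
        t = s[t]
    return t

def unify(t1, t2, s):
    t1, t2 = resolve(t1, s), resolve(t2, s)
    if t1 == t2:
        return s
    if isinstance(t1, str):
        return {**s, t1: t2}
    if isinstance(t2, str):
        return {**s, t2: t1}
    if t1[0] != t2[0] or len(t1) != len(t2):
        raise TypeError("incompatible type constructors")
    for a, b in zip(t1[1:], t2[1:]):
        s = unify(a, b, s)
    return s

INT = ("int",)
# int -> int -> int  =  X -> Y -> T   (the constraint at step (2))
s = unify(("arrow", INT, ("arrow", INT, INT)),
          ("arrow", "X", ("arrow", "Y", "T")), {})
assert resolve("X", s) == resolve("Y", s) == resolve("T", s) == INT
```

The solved form binds $X$, $Y$ and $T$ to int, matching step (4) above.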
Second, we must check that if the configuration $c\,v_1 \ldots v_k/\mu$ (where $k \geq 0$) is well-typed, then either it is reducible, or $c\,v_1 \ldots v_k$ is a value.
We begin by checking that every value that is well-typed with type int is of the form $\hat{k}$. Indeed, suppose that $\text{let } \Gamma_0; \text{ref } M \text{ in } \llbracket v : \mathtt{int} \rrbracket$ is satisfiable. Then, $v$ cannot be a program variable, for a well-typed value must be closed. $v$ cannot be a memory location $m$, for otherwise $\text{ref } M(m) \leq \mathtt{int}$ would be satisfiable; but the type constructors ref and int are incompatible. $v$ cannot be $\hat{+}$ or $\hat{+}\,v'$, for otherwise $\mathtt{int} \to \mathtt{int} \to \mathtt{int} \leq \mathtt{int}$ or $\mathtt{int} \to \mathtt{int} \leq \mathtt{int}$ would be satisfiable; but the type constructors $\to$ and int are incompatible. Similarly, $v$ cannot be a $\lambda$-abstraction. Thus, $v$ must be of the form $\hat{k}$, for it is the only case left.
Next, we note that, according to the constraint generation rules, if the configuration $c\,v_1 \ldots v_k/\mu$ is well-typed, then a constraint of the form $\text{let } \Gamma_0; \text{ref } M \text{ in } (c \preceq X_1 \to \ldots \to X_k \to T \wedge \llbracket v_1 : X_1 \rrbracket \wedge \ldots \wedge \llbracket v_k : X_k \rrbracket)$ is satisfiable. We now reason by cases on $c$.
  • Case $c$ is $\hat{k}$. Then, $\Gamma_0(c)$ is int. Because the type constructors int and $\to$ are incompatible with each other, this implies $k = 0$. Since $\hat{k}$ is a constructor, the expression is a value.
  • Case $c$ is $\hat{+}$. We may assume $k \geq 2$, because otherwise the expression is a value. Then, $\Gamma_0(c)$ is $\mathtt{int} \to \mathtt{int} \to \mathtt{int}$, so, by C-Arrow, the above constraint entails $\text{let } \Gamma_0; \text{ref } M \text{ in } (X_1 \leq \mathtt{int} \wedge X_2 \leq \mathtt{int} \wedge \llbracket v_1 : X_1 \rrbracket \wedge \llbracket v_2 : X_2 \rrbracket)$, which, by Lemma 1.6.3, entails $\text{let } \Gamma_0; \text{ref } M \text{ in } (\llbracket v_1 : \mathtt{int} \rrbracket \wedge \llbracket v_2 : \mathtt{int} \rrbracket)$. Thus, $v_1$ and $v_2$ are well-typed with type int. By the remark above, they must be integer literals $\hat{k}_1$ and $\hat{k}_2$. As a result, the configuration is reducible by R-Add.
1.9.5 Solution: We must first ensure that R-Ref, R-Deref and R-Assign respect $\sqsubseteq$ (Definition 1.7.5).
  • Case R-Ref. The reduction is $\text{ref } v/\varnothing \longrightarrow m/(m \mapsto v)$, where $m \notin \mathit{fpi}(v)$ (1). Let $T$ be an arbitrary type. According to Definition 1.7.5, the goal is to show that there exist a set of type variables $\bar{Y}$ and a store type $M'$ such that $\bar{Y} \mathrel{\#} \mathit{ftv}(T)$ and $\mathit{ftv}(M') \subseteq \bar{Y}$ and $\mathrm{dom}(M') = \{m\}$ and $\text{let } \Gamma_0 \text{ in } \llbracket \text{ref } v : T \rrbracket$ entails $\exists \bar{Y}.\,\text{let } \Gamma_0; \text{ref } M' \text{ in } \llbracket m/(m \mapsto v) : T/M' \rrbracket$. Now, we have
$$
\begin{align*}
& \text{let } \Gamma_0 \text{ in } \llbracket \text{ref } v : T \rrbracket \\
\equiv {} & \text{let } \Gamma_0 \text{ in } \exists XY.(Y \to \text{ref } Y \leq X \to T \wedge \llbracket v : X \rrbracket) \tag{2}\\
\equiv {} & \exists Y.\,\text{let } \Gamma_0 \text{ in } (\text{ref } Y \leq T \wedge \llbracket v : Y \rrbracket) \tag{3}\\
\equiv {} & \exists Y.\,\text{let } \Gamma_0; \text{ref } M' \text{ in } (m \preceq T \wedge \llbracket v : M'(m) \rrbracket) \tag{4}\\
\equiv {} & \exists Y.\,\text{let } \Gamma_0; \text{ref } M' \text{ in } \llbracket m/(m \mapsto v) : T/M' \rrbracket \tag{5}
\end{align*}
$$
where (2) is by definition of constraint generation and by definition of $\Gamma_0(\text{ref})$; (3) is by C-Arrow, Lemma 1.6.4, and C-InEx; (4) assumes $M'$ is defined as $m \mapsto Y$, and follows from (1), C-InId and C-In*; and (5) is by definition of constraint generation.
  • Case R-Deref. The reduction is $!m/(m \mapsto v) \longrightarrow v/(m \mapsto v)$. Let $T$ be an arbitrary type and let $M$ be a store type of domain $\{m\}$. We have
$$
\begin{align*}
& \text{let } \Gamma_0; \text{ref } M \text{ in } \llbracket\, !m/(m \mapsto v) : T/M \rrbracket \\
\equiv {} & \text{let } \Gamma_0; \text{ref } M \text{ in } \exists XY.(\text{ref } Y \to Y \leq X \to T \wedge m \preceq X \wedge \llbracket v : M(m) \rrbracket) \tag{1}\\
\equiv {} & \text{let } \Gamma_0; \text{ref } M \text{ in } \exists XY.(\text{ref } M(m) \leq X \leq \text{ref } Y \wedge Y \leq T \wedge \llbracket v : M(m) \rrbracket) \tag{2}\\
\equiv {} & \text{let } \Gamma_0; \text{ref } M \text{ in } \exists Y.(M(m) = Y \wedge Y \leq T \wedge \llbracket v : M(m) \rrbracket) \tag{3}\\
\equiv {} & \text{let } \Gamma_0; \text{ref } M \text{ in } (M(m) \leq T \wedge \llbracket v : M(m) \rrbracket) \tag{4}\\
\Vdash {} & \text{let } \Gamma_0; \text{ref } M \text{ in } (\llbracket v : T \rrbracket \wedge \llbracket v : M(m) \rrbracket) \tag{5}\\
\equiv {} & \text{let } \Gamma_0; \text{ref } M \text{ in } \llbracket v/(m \mapsto v) : T/M \rrbracket \tag{6}
\end{align*}
$$
where (1) is by definition of constraint generation and by definition of $\Gamma_0(!)$; (2) is by C-Arrow and C-InId; (3) follows from C-ExTrans and from the fact that ref is an invariant type constructor; (4) is by C-NameEq; (5) is by Lemma 1.6.3 and C-Dup; and (6) is again by definition of constraint generation.
  • Case R-Assign. The reduction is $m := v/(m \mapsto v_0) \longrightarrow v/(m \mapsto v)$. Let $T$ be an arbitrary type and let $M$ be a store type of domain $\{m\}$. We have
$$
\begin{align*}
& \text{let } \Gamma_0; \text{ref } M \text{ in } \llbracket m := v/(m \mapsto v_0) : T/M \rrbracket \\
\Vdash {} & \text{let } \Gamma_0; \text{ref } M \text{ in } \llbracket m := v : T \rrbracket \tag{1}\\
\equiv {} & \text{let } \Gamma_0; \text{ref } M \text{ in } \exists XYZ.(\text{ref } Z \to Z \to Z \leq X \to Y \to T \wedge m \preceq X \wedge \llbracket v : Y \rrbracket) \tag{2}\\
\equiv {} & \text{let } \Gamma_0; \text{ref } M \text{ in } \exists XYZ.(\text{ref } M(m) \leq X \leq \text{ref } Z \wedge Z \leq T \wedge \llbracket v : Y \rrbracket \wedge Y \leq Z) \tag{3}\\
\equiv {} & \text{let } \Gamma_0; \text{ref } M \text{ in } \exists Z.(M(m) = Z \wedge Z \leq T \wedge \llbracket v : Z \rrbracket) \tag{4}\\
\equiv {} & \text{let } \Gamma_0; \text{ref } M \text{ in } (M(m) \leq T \wedge \llbracket v : M(m) \rrbracket) \tag{5}\\
\Vdash {} & \text{let } \Gamma_0; \text{ref } M \text{ in } \llbracket v/(m \mapsto v) : T/M \rrbracket \tag{6}
\end{align*}
$$
where (1) and (2) are by definition of constraint generation; (3) is by C-Arrow and C-InId; (4) is by C-ExTrans, Lemma 1.6.4, and the fact that ref is an invariant type constructor; (5) is by C-NameEq; and (6) is obtained as in the previous case.
Second, we must check that if the configuration $c\,v_1 \ldots v_k/\mu$ (where $k \geq 0$) is well-typed, then either it is reducible, or $c\,v_1 \ldots v_k$ is a value. We only give a sketch of this proof; see the solution to Exercise 1.9.1 for details of a similar proof.
We begin by checking that every value that is well-typed with a type of the form $\text{ref } T$ is a memory location. This assertion relies on the fact that the type constructor ref is isolated.
Next, we note that, according to the constraint generation rules, if the configuration $c\,v_1 \ldots v_k/\mu$ is well-typed, then a constraint of the form $\text{let } \Gamma_0; \text{ref } M \text{ in } (c \preceq X_1 \to \ldots \to X_k \to T \wedge \llbracket v_1 : X_1 \rrbracket \wedge \ldots \wedge \llbracket v_k : X_k \rrbracket)$ is satisfiable. We now reason by cases on $c$.
  • Case $c$ is ref. If $k = 0$, then the expression is a value; otherwise, it is reducible by R-Ref.
  • Case $c$ is $!$. We may assume $k \geq 1$, because otherwise the expression is a value. Then, by definition of $\Gamma_0(!)$, the above constraint entails $\text{let } \Gamma_0; \text{ref } M \text{ in } \exists Y.(\text{ref } Y \to Y \leq X_1 \to \ldots \to X_k \to T \wedge \llbracket v_1 : X_1 \rrbracket)$, which, by C-Arrow, Lemma 1.6.3, and C-InEx, entails $\exists Y.\,\text{let } \Gamma_0; \text{ref } M \text{ in } \llbracket v_1 : \text{ref } Y \rrbracket$. Thus, $v_1$ is well-typed with a type of the form $\text{ref } Y$. By the remark above, $v_1$ must be a memory location $m$. Furthermore, because every well-typed configuration is closed, $m$ must be a member of $\mathrm{dom}(\mu)$. As a result, the configuration $!\,v_1 \ldots v_k/\mu$ is reducible by R-Deref.
  • Case $c$ is $:=$. We may assume $k \geq 2$, because otherwise the expression is a value. As above, we check that $v_1$ must be a memory location and a member of $\mathrm{dom}(\mu)$. Thus, the configuration is reducible by R-Assign.
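For readers who want a concrete reference point, the three primitives handled by these cases behave just like OCaml's store primitives; the snippet below (ordinary OCaml, not the chapter's formal calculus) exercises the reductions R-Ref, R-Deref, and R-Assign in turn.

```ocaml
(* Allocation, dereferencing, and assignment, mirroring the three
   proof cases above. This only illustrates the operational behaviour
   that R-Ref, R-Deref, and R-Assign formalise. *)
let r = ref 0        (* R-Ref: allocate a fresh location holding 0 *)
let v = !r           (* R-Deref: read the location's contents *)
let () = r := v + 1  (* R-Assign: overwrite the location *)
let () = assert (!r = 1)
```

Note that in the calculus, ref, $!$, and $:=$ are curried constants, which is why the proof must treat partial applications (small $k$) as values.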
1.9.6 Solution: We must first ensure that R-Fix respects $\sqsubseteq$ (Definition 1.7.5). Since the rule is pure, it is sufficient to establish that let $\Gamma_0$ in $\llbracket \mathrm{fix}\,v_1\,v_2 : T \rrbracket$ entails let $\Gamma_0$ in $\llbracket v_1\,(\mathrm{fix}\,v_1)\,v_2 : T \rrbracket$. Let $C$ stand for the constraint $\mathrm{fix} \preceq ((X \rightarrow Y) \rightarrow (X \rightarrow Y)) \rightarrow X \rightarrow Y \wedge Y \leq T \wedge \llbracket v_1 : (X \rightarrow Y) \rightarrow (X \rightarrow Y) \rrbracket \wedge \llbracket v_2 : X \rrbracket$. We have
$$
\begin{aligned}
& \text{let } \Gamma_0 \text{ in } \llbracket \mathrm{fix}\,v_1\,v_2 : T \rrbracket \\
{}\equiv{} & \text{let } \Gamma_0 \text{ in } \exists X_1 X_2.(\mathrm{fix} \preceq X_1 \rightarrow X_2 \rightarrow T \wedge \llbracket v_1 : X_1 \rrbracket \wedge \llbracket v_2 : X_2 \rrbracket) && (1) \\
{}\equiv{} & \text{let } \Gamma_0 \text{ in } \exists X_1 X_2 X Y.(((X \rightarrow Y) \rightarrow (X \rightarrow Y)) \rightarrow X \rightarrow Y \leq X_1 \rightarrow X_2 \rightarrow T \\
& \qquad \wedge\, \llbracket v_1 : X_1 \rrbracket \wedge \llbracket v_2 : X_2 \rrbracket) && (2) \\
{}\equiv{} & \text{let } \Gamma_0 \text{ in } \exists X Y.(Y \leq T \wedge \llbracket v_1 : (X \rightarrow Y) \rightarrow (X \rightarrow Y) \rrbracket \wedge \llbracket v_2 : X \rrbracket) && (3) \\
{}\equiv{} & \text{let } \Gamma_0 \text{ in } \exists X Y.\,C && (4)
\end{aligned}
$$
where (1) is by definition of constraint generation; (2) is by definition of $\Gamma_0(\mathrm{fix})$; (3) is by C-Arrow and Lemma 1.6.4; (4) is by definition of $\Gamma_0(\mathrm{fix})$. By Theorem 1.6.2 and Weaken, the judgements $C \vdash v_1 : (X \rightarrow Y) \rightarrow (X \rightarrow Y)$ and $C \vdash v_2 : X$ hold. By Var, Weaken, App, and Sub, it follows that $C \vdash v_1\,(\mathrm{fix}\,v_1)\,v_2 : T$ holds. By Theorem 1.6.6, this implies $C \Vdash \llbracket v_1\,(\mathrm{fix}\,v_1)\,v_2 : T \rrbracket$. By congruence of entailment and by C-Ex*, (4) entails let $\Gamma_0$ in $\llbracket v_1\,(\mathrm{fix}\,v_1)\,v_2 : T \rrbracket$.
Second, we must check that if the configuration $\mathrm{fix}\,v_1 \ldots v_k / \mu$ (where $k \geq 0$) is well-typed, then either it is reducible, or $\mathrm{fix}\,v_1 \ldots v_k$ is a value. This is immediate, for it is a value when $k < 2$, and it is reducible by R-Fix when $k \geq 2$.
We now recall that the construct letrec $f = \lambda z.t_1$ in $t_2$ provided by ML-the-programming-language may be viewed as syntactic sugar for let $f = \mathrm{fix}\,(\lambda f.\lambda z.t_1)$ in $t_2$, and set forth to discover the constraint generation rule that arises out of such a definition. We have
$$
\begin{aligned}
& \text{let } \Gamma_0 \text{ in } \llbracket \mathrm{fix}\,(\lambda f.\lambda z.t_1) : T \rrbracket \\
{}\equiv{} & \text{let } \Gamma_0 \text{ in } \exists Z.(\mathrm{fix} \preceq Z \rightarrow T \wedge \llbracket \lambda f.\lambda z.t_1 : Z \rrbracket) && (1) \\
{}\equiv{} & \text{let } \Gamma_0 \text{ in } \exists X Y.(X \rightarrow Y \leq T \wedge \llbracket \lambda f.\lambda z.t_1 : (X \rightarrow Y) \rightarrow (X \rightarrow Y) \rrbracket) && (2) \\
{}\equiv{} & \text{let } \Gamma_0 \text{ in } \exists X Y.(X \rightarrow Y \leq T \wedge \text{let } f : X \rightarrow Y; z : X \text{ in } \llbracket t_1 : Y \rrbracket) && (3)
\end{aligned}
$$
where (1) is by definition of constraint generation; (2) is by definition of $\Gamma_0(\mathrm{fix})$, by C-Arrow, and by Lemma 1.6.4; and (3) follows from Lemma 1.6.5. This allows us to write
$$
\begin{aligned}
& \text{let } \Gamma_0 \text{ in } \llbracket \text{let } f = \mathrm{fix}\,(\lambda f.\lambda z.t_1) \text{ in } t_2 : T \rrbracket \\
{}\equiv{} & \text{let } \Gamma_0; f : \forall Z[\llbracket \mathrm{fix}\,(\lambda f.\lambda z.t_1) : Z \rrbracket].Z \text{ in } \llbracket t_2 : T \rrbracket && (4) \\
{}\equiv{} & \text{let } \Gamma_0; f : \forall Z[\exists X Y.(X \rightarrow Y \leq Z \wedge \text{let } f : X \rightarrow Y; z : X \text{ in } \llbracket t_1 : Y \rrbracket)].Z \text{ in } \llbracket t_2 : T \rrbracket && (5) \\
{}\equiv{} & \text{let } \Gamma_0; f : \forall X Y[\text{let } f : X \rightarrow Y; z : X \text{ in } \llbracket t_1 : Y \rrbracket].X \rightarrow Y \text{ in } \llbracket t_2 : T \rrbracket && (6)
\end{aligned}
$$
where (4) is by definition of constraint generation; (5) follows from C-LetDup and from the previous series of equivalences; (6) is by C-LetEx, C-ExTrans, and Lemma 1.3.22.
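The desugaring can also be tried out concretely. The sketch below checks in OCaml that the letrec form and its let/fix desugaring compute the same function; since the calculus treats fix as a primitive constant, whereas OCaml has none, we bootstrap a fix combinator with let rec.

```ocaml
(* fix as a combinator: fix f z reduces to f (fix f) z, as in R-Fix.
   The eta-expanded right-hand side keeps evaluation call-by-value. *)
let rec fix f z = f (fix f) z

(* letrec form: letrec fact = λz.t1 *)
let rec fact z = if z = 0 then 1 else z * fact (z - 1)

(* desugared form: let fact' = fix (λf.λz.t1) *)
let fact' = fix (fun f z -> if z = 0 then 1 else z * f (z - 1))

let () = assert (fact 5 = fact' 5 && fact' 5 = 120)
```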
1.9.21 Solution: We have
$$
\begin{aligned}
& \llbracket \text{match } t_1 \text{ with } z.t_2 : T \rrbracket \\
{}\equiv{} & \text{let } \forall X X'[\llbracket t_1 : X \rrbracket \wedge \text{let } z : X' \text{ in } \llbracket X : z \rrbracket].(z : X') \text{ in } \llbracket t_2 : T \rrbracket && (1) \\
{}\equiv{} & \text{let } z : \forall X'[\exists X.(\llbracket t_1 : X \rrbracket \wedge X \leq X')].X' \text{ in } \llbracket t_2 : T \rrbracket && (2) \\
{}\equiv{} & \text{let } z : \forall X'[\llbracket t_1 : X' \rrbracket].X' \text{ in } \llbracket t_2 : T \rrbracket && (3) \\
{}\equiv{} & \llbracket \text{let } z = t_1 \text{ in } t_2 : T \rrbracket && (4)
\end{aligned}
$$
where (1) is by definition of constraint generation for match; (2) is by definition of constraint generation for patterns, by C-InId, C-In*, and C-LetEx; (3) is by Lemma 1.6.4; (4) is by definition of constraint generation for let.
1.9.26 Solution: The type scheme $\forall \bar{X}.T \rightarrow T$ may be written $\forall \bar{X}.[X \mapsto T](X \rightarrow X)$. Furthermore, $\bar{X} \mathrel{\#} \forall X.X \rightarrow X$ holds. Thus, $\forall \bar{X}.T \rightarrow T$ is an instance of $\forall X.X \rightarrow X$ in the sense of DM-Inst′. Since DM-Inst′ is an admissible rule for the type system DM, and since it is clear that the identity function $\lambda z.z$ has type $\forall X.X \rightarrow X$, it must also have type $\forall \bar{X}.T \rightarrow T$. (A more direct proof of this fact would not be difficult.) So, the destructor $(\cdot : \exists \bar{X}.T)$ has not only identity semantics, but also an identity type. This shows that our definitions are sound.
Let us now check requirement (i) of Definition 1.7.6. Since R-Annotation is pure, it suffices to show that let $\Gamma_0$ in $\llbracket (v : \exists \bar{X}.T) : T' \rrbracket$ entails let $\Gamma_0$ in $\llbracket v : T' \rrbracket$.
Now, we have
$$
\begin{aligned}
& \text{let } \Gamma_0 \text{ in } \llbracket (v : \exists \bar{X}.T) : T' \rrbracket \\
{}\equiv{} & \text{let } \Gamma_0 \text{ in } \exists X \bar{X}.(T \rightarrow T \leq X \rightarrow T' \wedge \llbracket v : X \rrbracket) && (1) \\
{}\equiv{} & \text{let } \Gamma_0 \text{ in } \exists X \bar{X}.(X \leq T \leq T' \wedge \llbracket v : X \rrbracket) && (2) \\
{}\Vdash{} & \text{let } \Gamma_0 \text{ in } \llbracket v : T' \rrbracket && (3)
\end{aligned}
$$
where (1) is by definition of constraint generation and by definition of $\Gamma_0((\cdot : \exists \bar{X}.T))$; (2) is by C-Arrow; and (3) follows from Lemma 1.6.3 and C-Ex*.

1.10.5 Solution: We have

$$
\begin{aligned}
& \text{let } \Gamma_0 \text{ in } \exists Z.\llbracket (\lambda z.z \mathbin{\hat{+}} \hat{1} : \forall X.X \rightarrow X) : Z \rrbracket \\
{}\equiv{} & \text{let } \Gamma_0 \text{ in } \exists Z.(\forall X.\llbracket \lambda z.z \mathbin{\hat{+}} \hat{1} : X \rightarrow X \rrbracket \wedge \exists X.(X \rightarrow X \leq Z)) && (1) \\
{}\equiv{} & \text{let } \Gamma_0 \text{ in } \forall X.\,\text{let } z : X \text{ in } \llbracket z \mathbin{\hat{+}} \hat{1} : X \rrbracket && (2) \\
{}\equiv{} & \forall X.(\mathrm{int} \rightarrow \mathrm{int} \rightarrow \mathrm{int} \leq X \rightarrow \mathrm{int} \rightarrow X) && (3) \\
{}\equiv{} & \forall X.(X = \mathrm{int}) && (4) \\
{}\equiv{} & \text{false} && (5)
\end{aligned}
$$
where (1) is by definition of constraint generation for universal type annotations; (2) is obtained by restricting the scope of $\exists Z$ to the second conjunct, then dropping the latter altogether, since it is equivalent to true, and by Lemma 1.6.5; (3) is obtained by definition of constraint generation, by definition of $\Gamma_0(\hat{+})$ and of $\Gamma_0(\hat{1})$, and by a few simple equivalence laws; (4) follows from C-Arrow and antisymmetry of subtyping; (5) follows from the fact that int and (say) int $\rightarrow$ int have distinct interpretations, since the type constructors int and $\rightarrow$ are incompatible. On the other hand, we have
$$
\begin{aligned}
& \text{let } \Gamma_0 \text{ in } \exists Z.\llbracket (\lambda z.z : \forall X.X \rightarrow X) : Z \rrbracket \\
{}\equiv{} & \text{let } \Gamma_0 \text{ in } \forall X.\,\text{let } z : X \text{ in } \llbracket z : X \rrbracket && (1) \\
{}\equiv{} & \forall X.(X \leq X) && (2) \\
{}\equiv{} & \text{true} && (3)
\end{aligned}
$$
where (1) is obtained as above; (2) is by definition of constraint generation, C-InId, and C-In*; (3) is by reflexivity of subtyping.
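These two outcomes can be observed in OCaml, whose explicitly polymorphic annotations play the role of the $\forall X.X \rightarrow X$ annotation above; the names below are ours.

```ocaml
(* λz.z checks against the polymorphic annotation, as in the second
   derivation: the generated constraint is equivalent to true. *)
let id : 'a. 'a -> 'a = fun z -> z

(* λz.z+1 does not: the analogue of the first derivation's constraint
   ∀X.(X = int) is unsatisfiable, so OCaml rejects the next definition
   if it is uncommented.
   let bad : 'a. 'a -> 'a = fun z -> z + 1 *)

let () = assert (id 42 = 42 && id "x" = "x")
```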
1.10.6 Solution: Under the naïve constraint generation rule for universal type variable introduction, the constraint $\llbracket \forall X.(\lambda z.z : X \rightarrow X) : Z \rrbracket$ is equivalent to $\forall X.(\llbracket \lambda z.z : X \rightarrow X \rrbracket \wedge X \rightarrow X \leq Z)$. Since the first conjunct is a tautology, this is in turn equivalent to $\forall X.(X \rightarrow X \leq Z)$. In a nondegenerate free term model where subtyping is interpreted as equality, this constraint is unsatisfiable. In a non-structural subtyping model equipped with a least type $\bot$ and a greatest type $\top$, it is equivalent to $\bot \rightarrow \top \leq Z$. This is a pretty restrictive constraint: since no value has type $\bot$, a function whose type is (a supertype of) $\bot \rightarrow \top$ cannot ever be invoked at runtime. This situation is clearly unsatisfactory. Checking that $\forall X.\llbracket \lambda z.z : X \rightarrow X \rrbracket$ holds was indeed part of our intent, but constraining $Z$ to be a supertype of $X \rightarrow X$ for every $X$ was not.

$\bar{X} \mathrel{\#} \mathrm{ftv}(T')$ (3). By (1), (2), (3), and by definition of constraint generation for local universal type annotations, $\llbracket (t : \forall \bar{X}.T) : T' \rrbracket$ is well-defined and is $\forall \bar{X}.\llbracket t : T \rrbracket \wedge \exists \bar{X}.(T \leq T')$ (4). By (3) and by definition of constraint generation for introduction of universal type variables and for general type annotations, $\llbracket \forall \bar{X}.(t : T) : T' \rrbracket$ is $\forall \bar{X}.\exists Z.(\llbracket t : T \rrbracket \wedge T \leq Z) \wedge \exists \bar{X}.(\llbracket t : T \rrbracket \wedge T \leq T')$, where $Z$ is fresh, which we may immediately simplify to $\forall \bar{X}.\llbracket t : T \rrbracket \wedge \exists \bar{X}.(\llbracket t : T \rrbracket \wedge T \leq T')$ (5).
Using C-ExAnd and Lemma 1.10.1, it is straightforward to check that (4) and (5) are equivalent.
1.10.9 Solution: We have
$$
\begin{aligned}
& \exists Z.\llbracket \lambda z.\forall X.(z : X) : Z \rrbracket \\
{}\Vdash{} & \exists Z_1 Z_2.\,\text{let } z : Z_1 \text{ in } \llbracket \forall X.(z : X) : Z_2 \rrbracket && (1) \\
{}\Vdash{} & \exists Z_1.\forall X.(Z_1 \leq X) && (2)
\end{aligned}
$$
where (1) is by definition of constraint generation for $\lambda$-abstractions, dropping the constraint that relates $Z$, $Z_1$, and $Z_2$; (2) is by definition of constraint generation for universal type variable introduction, this time dropping information about $Z_2$. Now, in a nondegenerate equality model, the constraint (2) is equivalent to false. In fact, for (2) to be satisfiable, the interpretation of subtyping must admit a least element $\bot$. We now see that $\llbracket \lambda z.\forall X.(z : X) : Z \rrbracket$ is a very restrictive constraint. Indeed, it requires $z$ to have every type at once. Because $z$ is $\lambda$-bound, hence monomorphic, it must in fact have type $\bot$. On the other hand, we have
$$
\begin{aligned}
& \exists Z.\llbracket \forall X.\lambda z.(z : X) : Z \rrbracket \\
{}\equiv{} & \forall X.\exists Z.\llbracket \lambda z.(z : X) : Z \rrbracket && (1) \\
{}\equiv{} & \forall X.\exists Z Z_1 Z_2.(Z_1 \leq X \wedge X \leq Z_2 \wedge Z_1 \rightarrow Z_2 \leq Z) && (2) \\
{}\equiv{} & \text{true} && (3)
\end{aligned}
$$
where (1) is by definition of constraint generation for universal type variable introduction, dropping the second conjunct, which is entailed by the first; (2) is by Lemma 1.6.5, by definition of constraint generation for general type annotations, and by a few simple equivalence laws; (3) follows from C-NameEq and the witness substitution $[Z_1 \mapsto X, Z_2 \mapsto X, Z \mapsto (X \rightarrow X)]$.
$$
\begin{aligned}
& \llbracket \text{letrec } f : S = \lambda z.t_1 \text{ in } t_2 : T \rrbracket \\
{}\equiv{} & \text{let } f : \forall X[\llbracket \mathrm{fix}\,f{:}S.\lambda z.t_1 : X \rrbracket].X \text{ in } \llbracket t_2 : T \rrbracket && (1) \\
{}\equiv{} & \text{let } f : \forall X[\text{let } f : S \text{ in } \llbracket \lambda z.t_1 : S \rrbracket \wedge S \preceq X].X \text{ in } \llbracket t_2 : T \rrbracket && (2) \\
{}\equiv{} & \text{let } f : S \text{ in } \llbracket \lambda z.t_1 : S \rrbracket \wedge \text{let } f : \forall X[S \preceq X].X \text{ in } \llbracket t_2 : T \rrbracket && (3) \\
{}\equiv{} & \text{let } f : S \text{ in } (\llbracket \lambda z.t_1 : S \rrbracket \wedge \llbracket t_2 : T \rrbracket) && (4)
\end{aligned}
$$
where (1) is by definition of the letrec syntactic sugar and by definition of constraint generation for let constructs; we have $X \notin \mathrm{ftv}(S, t_1)$; (2) is by definition of constraint generation for fix; (3) is by C-LetAnd; (4) follows from the equivalence between the type schemes $\forall X[S \preceq X].X$ and $S$ (itself a direct consequence of C-ExTrans) and from C-InAnd.
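A programmer-supplied scheme $S$ is what OCaml's polymorphically annotated recursive definitions provide, and such an annotation becomes necessary whenever $f$ is used at a different type within its own body (polymorphic recursion). A small illustration with a nested datatype:

```ocaml
(* letrec f : S = λz.t1 in t2 with an explicit scheme S.
   Here S is ∀a. a tree -> int; without the annotation, the recursive
   call (made at type ('a * 'a) tree) could not be typed, because a
   letrec-bound name is monomorphic within its own definition. *)
type 'a tree = Leaf of 'a | Node of ('a * 'a) tree

let rec depth : 'a. 'a tree -> int = function
  | Leaf _ -> 0
  | Node t -> 1 + depth t

let () = assert (depth (Node (Node (Leaf ((1, 2), (3, 4))))) = 2)
```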
1.11.16 Solution: We reason simultaneously in the subtyping model and in the equality-only model; that is, we rely only on properties that are valid in both models.
We must first ensure that rules RD-Default, RD-Found, and RD-Follow respect $\sqsubseteq$ (Definition 1.7.5).
  • Case RD-Default. The reduction is $\{v\}.\{\ell\} \xrightarrow{\delta} v$, which is pure. Therefore, it is sufficient to establish that let $\Gamma_0$ in $\llbracket \{v\}.\{\ell\} : T \rrbracket$ entails let $\Gamma_0$ in $\llbracket v : T \rrbracket$. In fact, we have:
$$
\begin{aligned}
& \text{let } \Gamma_0 \text{ in } \llbracket \{v\}.\{\ell\} : T \rrbracket \\
{}\equiv{} & \text{let } \Gamma_0 \text{ in } \exists X Y.(\cdot.\{\ell\} \preceq X \rightarrow T \wedge \{\cdot\} \preceq Y \rightarrow X \wedge \llbracket v : Y \rrbracket) && (1) \\
{}\equiv{} & \text{let } \Gamma_0 \text{ in } \exists X Y.(\exists X_1 X_2.(\Pi(\ell : X_1; X_2) \rightarrow X_1 \leq X \rightarrow T) \\
& \qquad \wedge\, \exists Y_1.(Y_1 \rightarrow \Pi(\partial Y_1) \leq Y \rightarrow X) \wedge \llbracket v : Y \rrbracket) && (2) \\
{}\Vdash{} & \text{let } \Gamma_0 \text{ in } \exists X_1 X_2 Y.(\partial Y \leq (\ell : X_1; X_2) \wedge X_1 \leq T \wedge \llbracket v : Y \rrbracket) && (3) \\
{}\Vdash{} & \text{let } \Gamma_0 \text{ in } \exists X_1 Y.(Y \leq X_1 \wedge X_1 \leq T \wedge \llbracket v : Y \rrbracket) && (4) \\
{}\Vdash{} & \text{let } \Gamma_0 \text{ in } \llbracket v : T \rrbracket && (5)
\end{aligned}
$$
where (1) is by definition of constraint generation; (2) is by definition of $\Gamma_0$ and C-InId; (3) is by the variances of $\Pi$, $\ell$, and $\rightarrow$, C-And, C-Ex*, and C-ExAnd; (4) is by C-Row-DL and covariance of $\ell$; and (5) is by Lemma 1.6.3.
• Case RD-Found. The reduction is $\{w \text{ with } \ell = v\}.\{\ell\} \xrightarrow{\delta} v$. It suffices to establish that let $\Gamma_0$ in $\llbracket \{w \text{ with } \ell = v\}.\{\ell\} : T \rrbracket$ entails let $\Gamma_0$ in $\llbracket v : T \rrbracket$. In fact, we have:
$$\begin{array}{rll}
& \text{let } \Gamma_0 \text{ in } \llbracket \{w \text{ with } \ell = v\}.\{\ell\} : T \rrbracket & \\
\equiv & \text{let } \Gamma_0 \text{ in } \exists XYY'.\,(\cdot.\{\ell\} \preceq X \rightarrow T \;\wedge\; \{\cdot \text{ with } \ell = \cdot\} \preceq Y \rightarrow Y' \rightarrow X & \\
& \qquad \wedge\; \llbracket w : Y \rrbracket \;\wedge\; \llbracket v : Y' \rrbracket) & (1) \\
\equiv & \text{let } \Gamma_0 \text{ in } \exists XYY'.\,(\exists X_1 X_2.\,(\Pi(\ell : X_1 ; X_2) \rightarrow X_1 \leq X \rightarrow T) & \\
& \qquad \wedge\; \exists Y_1 Y_2 Y_3.\,(\Pi(\ell : Y_1 ; Y_3) \rightarrow Y_2 \rightarrow \Pi(\ell : Y_2 ; Y_3) \leq Y \rightarrow Y' \rightarrow X) & \\
& \qquad \wedge\; \llbracket w : Y \rrbracket \;\wedge\; \llbracket v : Y' \rrbracket) & (2) \\
\Vdash & \text{let } \Gamma_0 \text{ in } \exists Y' X_1 Y_2.\,(Y' \leq Y_2 \;\wedge\; Y_2 \leq X_1 \;\wedge\; X_1 \leq T \;\wedge\; \llbracket v : Y' \rrbracket) & (3) \\
\Vdash & \text{let } \Gamma_0 \text{ in } \llbracket v : T \rrbracket & (4)
\end{array}$$
where (1) is by definition of constraint generation; (2) is by definition of $\Gamma_0$ and C-InId; (3) is by the variances of $\Pi$, $\ell$, and $\rightarrow$, C-And, C-Ex*, and C-ExAnd; and (4) is by Lemma 1.6.3.
• Case RD-Follow. The proof is similar to the previous case.
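As a concrete illustration of the three $\delta$-rules above, here is a small Python sketch (the encoding and the names `default`, `extend`, and `access` are ours, not the chapter's) of records with defaults and of field access:

```python
# Records with defaults, encoded as nested tuples:
#   ("default", v)          models {v}
#   ("with", w, label, v)   models {w with label = v}

def default(v):
    """The record {v}, whose every field defaults to v."""
    return ("default", v)

def extend(w, label, v):
    """The record {w with label = v}."""
    return ("with", w, label, v)

def access(r, label):
    """Field access r.{label}, following RD-Default, RD-Found, RD-Follow."""
    if r[0] == "default":
        return r[1]                 # RD-Default: {v}.{l} reduces to v
    _, w, l, v = r
    if l == label:
        return v                    # RD-Found: {w with l = v}.{l} reduces to v
    return access(w, label)         # RD-Follow: {w with l' = v}.{l} reduces to w.{l}

# {{0} with a = 1, b = 2}: fields a and b are explicit, all others default to 0
r = extend(extend(default(0), "a", 1), "b", 2)
```

For instance, `access(r, "b")` answers via RD-Found, while `access(r, "c")` falls through the two explicit fields and reaches the default 0 via RD-Default.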
We must now check that if the configuration $F\,v_1 \ldots v_k / \mu$ is well-typed, then either it is reducible, or it is a value.
We begin by checking that every value that is well-typed with type $\Pi T$ is a record value, that is, either of the form $\{v'\}$ or $\{w \text{ with } \ell' = v'\}$. Indeed, suppose that let $\Gamma_0$ in $\llbracket v : \Pi T \rrbracket$ is satisfiable. Then, $v$ cannot be a program variable, for a well-typed value must be closed; $v$ cannot be a memory location $m$, for otherwise ref $M(m) \leq \Pi T$ would be satisfiable, but the top type constructors ref and $\Pi$ are incompatible (since $\Pi$ is isolated); $v$ cannot be a partial application of a constructor or a primitive, nor a $\lambda$-abstraction, since otherwise $T' \rightarrow T'' \leq \Pi T$ would be satisfiable, but the top type constructors $\rightarrow$ and $\Pi$ are incompatible (since they are both isolated); thus $v$ must be of the form $\{v'\}$ or $\{w \text{ with } \ell' = v'\}$, for these are the only remaining cases.
Next, we note that, according to the constraint generation rules, if the configuration $c\,v_1 \ldots v_k / \mu$ is well-typed, then a constraint of the form let $\Gamma_0$; ref $M$ in $(c \preceq X_1 \rightarrow \ldots \rightarrow X_k \rightarrow T \wedge \llbracket v_1 : X_1 \rrbracket \wedge \ldots \wedge \llbracket v_k : X_k \rrbracket)$ is satisfiable. We now reason by cases on $c$.
◦ Case $c$ is $\{\cdot\}$. We may assume $k \geq 2$, since otherwise the expression is a value. Then $\Gamma_0(c)$ is $\forall X.\,X \rightarrow \Pi(\partial X)$, so by C-InId and C-Arrow the above constraint entails $\exists X.\,(\Pi(\partial X) \leq X_2 \rightarrow \ldots \rightarrow T)$, which entails false, since the top type constructors $\rightarrow$ and $\Pi$ are incompatible. Thus, this case cannot occur.
◦ Case $c$ is $\{\cdot \text{ with } \ell = \cdot\}$. Similar to the previous case.
◦ Case $c$ is $\cdot.\{\ell\}$. We may assume $k \geq 1$, since otherwise the expression is a value. Then $\Gamma_0(c)$ is $\forall XY.\,\Pi(\ell : X ; Y) \rightarrow X$, so by C-InId and C-Arrow the above constraint entails let $\Gamma_0$; ref $M$ in $(\exists XY.\,(X_1 \leq \Pi(\ell : X ; Y)) \wedge \llbracket v_1 : X_1 \rrbracket)$, which by Lemma 1.6.3 entails let $\Gamma_0$; ref $M$ in $\exists XY.\,\llbracket v_1 : \Pi(\ell : X ; Y) \rrbracket$. Thus $v_1$ is a record value: either it is of the form $\{v'\}$, and the configuration is reducible to $v'$, or it is of the form $\{v'' \text{ with } \ell' = v'\}$, and the configuration is reducible to $v'$ (if $\ell'$ is $\ell$) or to $v''.\{\ell\}$ (otherwise).
1.11.17 Solution: We add a collection of destructors $\cdot[\ell_1 \leftrightarrow \ell_2]$ of arity 1, one for each pair of distinct labels, with the following semantics:
$$\begin{array}{rl}
\{v\}[\ell_1 \leftrightarrow \ell_2] \xrightarrow{\delta} \{v\} & \\
\{w \text{ with } \ell = v\}[\ell_1 \leftrightarrow \ell_2] \xrightarrow{\delta} \{w[\ell_1 \leftrightarrow \ell_2] \text{ with } \ell = v\} & \text{if } \ell \notin \{\ell_1, \ell_2\} \\
\{w \text{ with } \ell = v\}[\ell_1 \leftrightarrow \ell_2] \xrightarrow{\delta} \{w[\ell_1 \leftrightarrow \ell_2] \text{ with } \bar{\ell} = v\} & \text{if } \{\ell, \bar{\ell}\} = \{\ell_1, \ell_2\}
\end{array}$$
The initial environment $\Gamma_0$ must be extended with the following typing assumption:
$$\cdot[\ell_1 \leftrightarrow \ell_2] : \quad \forall X_1 X_2 Y.\,\Pi(\ell_1 : X_1 ; \ell_2 : X_2 ; Y) \rightarrow \Pi(\ell_1 : X_2 ; \ell_2 : X_1 ; Y)$$
We must then check subject reduction for the new primitive. Since we only added a destructor (and no constructor), the classification of values is unchanged, and it suffices to check progress for the new primitive, that is, that well-typed expressions of the form $[\ell_1 \leftrightarrow \ell_2]\,v_1 \ldots v_n$ are either values or can be further reduced. Both parts are easy and similar to the corresponding parts of Exercise 1.11.16.
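The reduction rules for the transposition destructor can be made concrete in the same hypothetical tuple encoding of records as before (a sketch of ours, not the chapter's code): a record is either `("default", v)` or `("with", w, label, v)`.

```python
def access(r, label):
    """Field access r.{label} on the tuple encoding of records."""
    if r[0] == "default":
        return r[1]
    _, w, l, v = r
    return v if l == label else access(w, label)

def transpose(r, l1, l2):
    """r[l1 <-> l2]: exchange the contents of fields l1 and l2."""
    if r[0] == "default":
        return r                     # a fully-default record is unchanged
    _, w, l, v = r
    if l == l1:                      # third rule: the label l1 becomes l2
        l = l2
    elif l == l2:                    # third rule: the label l2 becomes l1
        l = l1
    # second rule: transpose the remaining fields recursively
    return ("with", transpose(w, l1, l2), l, v)

# {{0} with a = 1, b = 2}
r = ("with", ("with", ("default", 0), "a", 1), "b", 2)
s = transpose(r, "a", "b")
```

After the transposition, field a holds what b held and vice versa, while every other field (here, the default) is unaffected.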
1.11.18 Solution: There are several solutions. One of them is to assume a fixed total ordering on row labels, and to retain as constructors only those $\ell^{\kappa,L}$ such that $\ell < L$, that is, $\ell < \ell'$ for all $\ell' \in L$; the other constants $\ell^{\kappa,L}$, those such that $\ell \nless L$, are demoted from constructors to destructors, with the following collection of reduction rules:
$$\{\{w \text{ with } \ell' = v'\} \text{ with } \ell = v\} \xrightarrow{\delta} \{\{w \text{ with } \ell = v\} \text{ with } \ell' = v'\} \quad \text{(RD-Transpose)}$$
for all labels $\ell$ and $\ell'$ such that $\ell' < \ell$, and
$$\{\{w \text{ with } \ell = v'\} \text{ with } \ell = v\} \xrightarrow{\delta} \{w \text{ with } \ell = v\} \quad \text{(RD-Discard)}$$
for all labels $\ell$. It is now obvious that values are in normal form, in the sense that explicit fields are never repeated and are always listed in order. Typing rules need not be changed, so requirement (i) of Definition 1.7.6 still holds. Requirement (ii) needs to be checked, in particular, for the new primitives $\ell^L$, which we leave to the reader (the proof for $\cdot.\{\ell\}$ should carry over unchanged).
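To make the normalization concrete, here is a hypothetical Python sketch of ours (with the ordinary `<` on strings standing in for the fixed total ordering on labels): a smart constructor that keeps the explicit fields of the tuple-encoded records sorted and duplicate-free, mimicking RD-Transpose and RD-Discard.

```python
def extend_normal(w, l, v):
    """Build {w with l = v} in normal form, assuming w is already normal
    (its explicit labels strictly increase from the outside in)."""
    if w[0] == "default":
        return ("with", w, l, v)
    _, w2, l2, v2 = w
    if l == l2:
        return ("with", w2, l, v)        # RD-Discard: drop the shadowed field
    if l < l2:
        return ("with", w, l, v)         # already in order: l sits outermost
    # RD-Transpose: push the new field past the smaller label l2
    return ("with", extend_normal(w2, l, v), l2, v2)

def labels(r):
    """Explicit labels of a record, from the outside in."""
    return [] if r[0] == "default" else [r[2]] + labels(r[1])

# Extending in arbitrary order still yields the unique normal form:
r = ("default", 0)
for l, v in [("b", 2), ("a", 1), ("b", 9)]:
    r = extend_normal(r, l, v)
```

The repeated extension at b is discarded in favor of the most recent one, and the surviving fields end up listed in label order, as the normal-form argument requires.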
1.11.19 Solution: Let map have type $\Pi(X \rightarrow Y) \rightarrow \Pi(X) \rightarrow \Pi(Y)$, with the following reduction rules in the semantics with normal forms:
$$\begin{array}{c}
\operatorname{map} \{v' \text{ with } \ell = v\}\, w \xrightarrow{\delta} \{\operatorname{map} v'\, w \text{ with } \ell = v\,(w.\{\ell\})\} \\
\operatorname{map} v\, \{w' \text{ with } \ell = w\} \xrightarrow{\delta} \{\operatorname{map} v\, w' \text{ with } \ell = (v.\{\ell\})\, w\} \\
\operatorname{map} \{v\}\,\{w\} \xrightarrow{\delta} \{v\, w\}
\end{array}$$
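The three rules can be mimicked in Python under our hypothetical tuple encoding of records (`("default", v)` for $\{v\}$, `("with", w, label, v)` for $\{w \text{ with } \ell = v\}$, and `access` for $\cdot.\{\ell\}$); this is a sketch, not the chapter's code:

```python
def access(r, label):
    """Field access r.{label}."""
    if r[0] == "default":
        return r[1]
    _, w, l, v = r
    return v if l == label else access(w, label)

def rmap(f, r):
    """map f r: pointwise application of a record of functions f
    to a record of arguments r, following the three rules above."""
    if f[0] == "with":                   # first rule: peel a field off f
        _, f2, l, fn = f
        return ("with", rmap(f2, r), l, fn(access(r, l)))
    if r[0] == "with":                   # second rule: peel a field off r
        _, r2, l, v = r
        return ("with", rmap(f, r2), l, access(f, l)(v))
    return ("default", f[1](r[1]))       # third rule: map {v} {w} -> {v w}

# f maps field a with doubling and every other field with successor
f = ("with", ("default", lambda x: x + 1), "a", lambda x: 2 * x)
r = ("with", ("default", 10), "a", 3)
m = rmap(f, r)
```

Field a of the result is $2 \times 3 = 6$, while any other field, say b, is the successor of the default $10$, that is, $11$.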
1.11.22 Solution: To ensure that the field $\ell$ is not present in the argument of extension, it suffices to restrict the typing assumption for extension as follows:
$$\langle \cdot \text{ with } \ell = \cdot \rangle : \quad \forall X' Y.\,\Pi(\ell : \text{abs} ; Y) \rightarrow X' \rightarrow \Pi(\ell : \text{pre } X' ; Y)$$
To remove an existing field, we can use the following syntactic sugar:
$$\cdot \backslash \ell \;\stackrel{\text{def}}{=}\; \lambda v.\langle v \text{ with } \ell = \text{abs} \rangle \;:\; \forall XY.\,\Pi(\ell : X ; Y) \rightarrow \Pi(\ell : \text{abs} ; Y)$$
The following, weaker typing assumption could also be used; it ensures that the field is present before removal:
$$\forall XY.\,\Pi(\ell : \text{pre } X ; Y) \rightarrow \Pi(\ell : \text{abs} ; Y)$$
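A hypothetical Python sketch of ours makes the roles of pre and abs concrete on the tuple encoding of records: field contents are either `("pre", v)` or `"abs"`, strict access demands a present field, and the removal sugar overwrites a field with abs.

```python
def access(r, label):
    """Raw field lookup, returning a content: ("pre", v) or "abs"."""
    if r[0] == "default":
        return r[1]
    _, w, l, v = r
    return v if l == label else access(w, label)

def strict_access(r, label):
    """Strict access: the field must be present (pre), otherwise fail."""
    c = access(r, label)
    if c == "abs":
        raise KeyError(label)
    return c[1]

def remove(r, label):
    """The removal sugar: overwrite the field with abs."""
    return ("with", r, label, "abs")

def has_field(r, label):
    """True if the field is present (its content is pre v)."""
    return access(r, label) != "abs"

# A record where only field a is present, with value 1
r = ("with", ("default", "abs"), "a", ("pre", 1))
```

Here `strict_access` plays the role of the access primitive typed with pre, and `remove` mirrors $\cdot \backslash \ell$: after removal, the field is reported absent.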
1.11.25 Solution: The proof is similar to that of Exercise 1.11.16, but slightly more complex, because we must also check that fields are defined when accessed, and because of subtyping.
We reason simultaneously in the subtyping model and in the equality-only model; that is, we rely only on properties that are valid in both models.
We must first ensure that rules RE-Found and RE-Follow satisfy the requirement of Definition 1.7.5.
• Case RE-Found: See Exercise ??. In line ??, field $\ell$ carries pre $X_1$ instead of $X_1$ and pre $Y_2$ instead of $Y_2$, and step ?? also uses the covariance of pre.
• Case RE-Follow: The proof is similar.
We must then check that if the configuration $F\,v_1 \ldots v_k / \mu$ is well-typed, then either it is reducible, or it is a value.
We begin by checking that every value that is well-typed with type $\Pi T$ is a record value, that is, either of the form $\langle\rangle$ or $\langle v'' \text{ with } \ell' = v' \rangle$. See Exercise 1.11.16.
Next, we note that, according to the constraint generation rules, if the configuration $c\,v_1 \ldots v_k / \mu$ is well-typed, then a constraint of the form let $\Gamma_0$; ref $M$ in $(c \preceq X_1 \rightarrow \ldots \rightarrow X_k \rightarrow T \wedge \llbracket v_1 : X_1 \rrbracket \wedge \ldots \wedge \llbracket v_k : X_k \rrbracket)$ is satisfiable. We now reason by cases on $c$.
◦ Case $c$ is $\langle\rangle$ or $\langle \cdot \text{ with } \ell = \cdot \rangle$. See Exercise 1.11.16.
◦ Case $c$ is $\cdot\langle\ell\rangle$. We may assume $k \geq 1$, since otherwise the expression is a value. Then $\Gamma_0(c)$ is $\forall XY.\,\Pi(\ell : \text{pre } X ; Y) \rightarrow X$, so by C-InId and C-Arrow the above constraint entails let $\Gamma_0$; ref $M$ in $(\exists XY.\,(X_1 \leq \Pi(\ell : \text{pre } X ; Y)) \wedge \llbracket v_1 : X_1 \rrbracket)$, which by Lemma 1.6.3 entails let $\Gamma_0$; ref $M$ in $\exists XY.\,\llbracket v_1 : \Pi(\ell : \text{pre } X ; Y) \rrbracket$. Thus $v_1$ is a record value, that is, either of the form $\langle\rangle$ or $\langle v'' \text{ with } \ell' = v' \rangle$. In fact, the former case cannot occur, since let $\Gamma_0$; ref $M$ in $\exists XY.\,\llbracket \langle\rangle : \Pi(\ell : \text{pre } X ; Y) \rrbracket$ entails $\exists XY.\,(\Pi(\partial\text{abs}) \leq \Pi(\ell : \text{pre } X ; Y))$ by C-InId and C-In*, which in turn entails $\exists X.\,(\text{abs} \leq \text{pre } X)$ by C-Row-DL and covariance of $\Pi$ and $\ell$. However, this constraint is equivalent to false, because $\phi(\text{abs}) \leq \phi(\text{pre } X)$ does not hold under any ground assignment $\phi$. Thus $v_1$ is of the form $\langle v'' \text{ with } \ell' = v' \rangle$, and the configuration is reducible to $v'$ if $\ell'$ is $\ell$, and to $v''\langle\ell\rangle$ otherwise.

  1. The (currently unfinished) code that accompanies this chapter may be found at http://pauillac.inria.fr/~remy/mlrow/. For space reasons, some material, including proofs, exercises, and more, has been left out of this version. In the future, a full version of this chapter that includes the missing material will be made available at the same address. In spite of these omissions, this chapter still exceeds Benjamin's 100-page limit: we currently have roughly 135 pages of text and 15 pages of solutions to exercises. We would appreciate comments and suggestions from the proofreaders as to how this chapter could be made shorter without severely compromising its interest or readability.